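I want to extract URLs from multiple domains and save the unique values to a txt file. The URLs come in different formats: some start with http, some with https, and some with 127.0.0.1. I only want the URL itself with the prefix removed, especially "127.0.0.1". I tried the following PowerShell script, but it gives me no results. Please help me solve this.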
`$threatFeedUrls =@("https://raw.githubusercontent.com/DandelionSprout/adfilt/master/Alternate%20versions%20Anti-Malware%20List/AntiMalwareHosts.txt",
"https://osint.digitalside.it/Threat-Intel/lists/latestdomains.txt")
#Initialize an array to store all extracted URLs
$allUrls = @()
#Loop through the lists of URLs
foreach ($url in $threatFeedUrls) {
# Download the threat feed data
$threatFeedData = Invoke-RestMethod -Uri $threatFeedUrl
# Define a regular expression pattern to match URLs starting with '127.0.0.1'
$pattern = '127\.0\.0\.1 ([^\s]+)'
# Use the regular expression to find matches in the threat feed data
$matchList = [regex]::Matches($threatFeedData, $pattern)
# Create and populate the list with matched URLs
$urlList =
foreach ($match in $matchList) {
$match.Groups[1].Value
}
# Specify the output file path
$outputFilePath = 'output250.txt'
# Save the URLs to the output file
$urlList | Out-File -FilePath $outputFilePath
Write-Host "URLs starting with '127.0.0.1' extracted from threat feed have been saved to $outputFilePath."
}`
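I wrote the PS script to extract all the URLs, but the output is not what I expected. I want to extract all URLs from the listed domains, remove duplicates, and save them in a single txt file.
You can try this: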
# Define the URLs to get
$threatFeedUrls = @(
"https://raw.githubusercontent.com/DandelionSprout/adfilt/master/Alternate%20versions%20Anti-Malware%20List/AntiMalwareHosts.txt",
"https://osint.digitalside.it/Threat-Intel/lists/latestdomains.txt"
)
# Get all the raw files
$Result = $threatFeedUrls | ForEach-Object { Invoke-RestMethod -Uri $_ -UseBasicParsing }
# Filter out comments and empty lines (split on LF or CRLF so no stray carriage returns remain)
$OnlyInterestingLines = $Result -split '\r?\n' | Where-Object { $_ -notmatch '^(#|\s|$)' }
# Remove 127.0.0.1 at the beginning of lines followed by any amount of whitespace, sort it and return only unique addresses
$urlList = $OnlyInterestingLines -replace '^127\.0\.0\.1\s*' | Sort-Object -Unique
# Specify the output file path
$outputFilePath = 'output250.txt'
# Save the URLs to the output file
$urlList | Out-File -FilePath $outputFilePath
The result is 21,780 lines of host names.
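For anyone who wants to check the filtering logic without downloading the feeds, here is a minimal offline sketch. It assumes the feed lines look like the ones described in the question (a `127.0.0.1` prefix, an `http://` or `https://` prefix, or a bare host name). The function name `Get-UniqueHostName` and the `example.*` host names are made up for illustration and do not come from the real feeds. As an aside, the script in the question loops with `$url` but passes `$threatFeedUrl` to `Invoke-RestMethod`. That variable is never assigned, which likely explains why nothing was produced.

```powershell
# Hypothetical sample feed text; the host names are placeholders, not real feed entries.
$sampleFeed = @'
# comment line
127.0.0.1 ads.example.com
127.0.0.1   tracker.example.net

https://malware.example.org
http://ads.example.com
plain.example.io
'@

# Hypothetical helper: turns raw feed text into a sorted list of unique host names.
function Get-UniqueHostName {
    param([string[]]$Text)

    $Text -split '\r?\n' |                              # split on LF or CRLF
        ForEach-Object { $_.Trim() } |                  # drop surrounding whitespace
        Where-Object { $_ -and $_ -notmatch '^#' } |    # skip empty lines and comments
        ForEach-Object {
            # Strip a leading 127.0.0.1 and then a leading http:// or https://
            $_ -replace '^127\.0\.0\.1\s+', '' -replace '^https?://', ''
        } |
        Sort-Object -Unique                             # sort and de-duplicate
}

$hostNames = Get-UniqueHostName -Text $sampleFeed
$hostNames
```

To use it on the real feeds, the downloaded text could be passed in place of `$sampleFeed` and the result piped to `Out-File` as in the script above. Stripping the `https?://` scheme is an addition that the answer's script does not do.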