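I am running a Python script that takes a comma-separated list of URLs and checks each one against a file named "Properties.csv". If a URL is in the file, the script should print some information from it; if it is not, the script should use the socket library to look the information up.
The script works fine when it only checks the file and extracts the information (the if statement in the script below), but it fails once I introduce the else statement. Even though the information the if statement looks for is in the file, I keep getting the "Unable to map" message from the except clause inside the else.
My input is a list of URLs like this: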
url0-function.domain.com,url1.com,url2.subdomain.domain.com,url3.domain.com,url3-function.domain.com,url4.domain.com,url4-function.domain.com,url5.domain.com,url5-function.domain.com
My desired output is:
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url0-function.domain.com', 'hostname': 'subdomain.function-domain.provider.net', 'ipaddr': '100.10.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url2.domain.com', 'hostname': 'url2.sudbdomain.domain.com', 'ipaddr': '100.20.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'www.domain.com', 'hostname': 'url2.sudbdomain.domain.com', 'ipaddr': '100.20.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url2.domain.com', 'hostname': 'url2.sudbdomain.domain.com', 'ipaddr': '100.20.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'www.domain.com', 'hostname': 'url2.sudbdomain.domain.com', 'ipaddr': '100.20.30.1'}
{'url': 'url3-function.domain.com', 'hostname': 'url3-function.domain.com', 'ipaddr': '100.30.30.1'}
{'url': 'url3-function.domain.com', 'hostname': 'url3-function.domain.com', 'ipaddr': '100.30.30.1'}
{'url': 'url3-function.domain.com', 'hostname': 'url3-function.domain.com', 'ipaddr': '100.30.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url4-function.domain.com', 'hostname': 'url4.subdomain-function-domain.com.provider.net', 'ipaddr': '100.50.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url4-function.domain.com', 'hostname': 'url4.subdomain-function-domain.com.provider.net', 'ipaddr': '100.50.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url4-function.domain.com', 'hostname': 'url4.subdomain-function-domain.com.provider.net', 'ipaddr': '100.50.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url4-function.domain.com', 'hostname': 'url4.subdomain-function-domain.com.provider.net', 'ipaddr': '100.50.30.1'}
Unable to map url5.domain.com moving on...
Unable to map url5-function.domain.com moving on...
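However, when the else statement is present, I get the unwanted output below. The if statement is skipped entirely, even though it should be hit first and does have matches: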
Unable to map url0-function.domain.com moving on...
{'url': 'url1.com', 'hostname': 'url1.com', 'ipaddr': '100.20.30.1'}
{'url': 'url1.com', 'hostname': 'url1.com', 'ipaddr': '100.20.30.1'}
{'url': 'url2.subdomain.domain.com', 'hostname': 'url2.subdomain.domain.com', 'ipaddr': '99.99.99.99'}
{'url': 'url3.domain.com', 'hostname': 'url3.domain.com', 'ipaddr': '100.30.30.1'}
{'url': 'url3.domain.com', 'hostname': 'url3.domain.com', 'ipaddr': '100.30.30.1'}
{'url': 'url3-function.domain.com', 'hostname': 'url3-function.domain.com', 'ipaddr': '100.30.30.1'}
{'url': 'url3-function.domain.com', 'hostname': 'url3-function.domain.com', 'ipaddr': '100.30.30.1'}
{'url': 'url3-function.domain.com', 'hostname': 'url3-function.domain.com', 'ipaddr': '100.30.30.1'}
{'url': 'url3-function.domain.com', 'hostname': 'url3-function.domain.com', 'ipaddr': '100.30.30.1'}
Unable to map url4.domain.com moving on...
Unable to map url4-function.domain.com moving on...
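When I remove the else statement, I get the output below, which is also not what I want: the file is checked as expected and the if statement works, but the other half of the script that I need is gone because I removed the else statement: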
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url0-function.domain.com', 'hostname': 'subdomain.function-domain.provider.net', 'ipaddr': '100.10.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url2.domain.com', 'hostname': 'url2.sudbdomain.domain.com', 'ipaddr': '100.20.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'www.domain.com', 'hostname': 'url2.sudbdomain.domain.com', 'ipaddr': '100.20.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url2.domain.com', 'hostname': 'url2.sudbdomain.domain.com', 'ipaddr': '100.20.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'www.domain.com', 'hostname': 'url2.sudbdomain.domain.com', 'ipaddr': '100.20.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url3.domain.com', 'hostname': 'url3.sudomain-function.domain.com.provider.net', 'ipaddr': '100.40.20.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url4.domain.com', 'hostname': 'url4.subdomain-function-domain.com.provider.net', 'ipaddr': '100.50.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url4.domain.com', 'hostname': 'url4.subdomain-function-domain.com.provider.net', 'ipaddr': '100.50.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url4-function.domain.com', 'hostname': 'url4.subdomain-function-domain.com.provider.net', 'ipaddr': '100.50.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url4-function.domain.com', 'hostname': 'url4.subdomain-function-domain.com.provider.net', 'ipaddr': '100.50.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url4-function.domain.com', 'hostname': 'url4.subdomain-function-domain.com.provider.net', 'ipaddr': '100.50.30.1'}
{'config_ref': 'SomeInternalConfigRef', 'hostname_ref': 'url4-function.domain.com', 'hostname': 'url4.subdomain-function-domain.com.provider.net', 'ipaddr': '100.50.30.1'}
Any suggestions on how to get my desired output would be greatly appreciated, thank you. Here is my script:
import re
import socket
from urllib.parse import urlparse
#
defined_url = input("What is the URL being queried for the mapping? DO NOT include the scheme (e.g., http://), DO NOT include a port, DO NOT include a path (e.g., /somepath/file.php), DO NOT include parameters (e.g., ?param1=foo&param2=bar), and DO NOT include an anchor (e.g., #foobarinthisdoc). Enter each URL separated by a comma and do not use any spaces: ")
defined_url_array = []
defined_url_array.append(defined_url)
for defined_urls in defined_url_array:
    url_array = defined_urls.split(",")
    url_array_len = len(url_array)
    index = 0
    properties_file = r"A:\\File\\Path\\Properties.csv"
    #
    while index < url_array_len:
        #
        defined_url_regex = re.compile(r"(?P<url>" + url_array[index] + r")")
        with open(properties_file, "r") as file_obj:
            for url_match in defined_url_regex.finditer(file_obj.read()):
                file_obj.seek(0)
                url = url_match.groupdict()
                for row_obj in file_obj:
                    if url["url"] in row_obj:
                        config_ref = row_obj.split(",")[1].strip()
                        hostname_ref = row_obj.split(",")[2].strip()
                        hostname = row_obj.split(",")[3].strip()
                        ipaddr = row_obj.split(",")[4].strip()
                        properties_dictionary = {"config_ref":config_ref,"hostname_ref":hostname_ref,"hostname":hostname,"ipaddr":ipaddr}
                        print(properties_dictionary)
                        break
                    else:
                        hostname_parse = urlparse(url_array[index])
                        try:
                            ipaddr = socket.gethostbyname(hostname_parse.path)
                            mapping_dictionary = {"url":url_array[index],"hostname":hostname_parse.path,"ipaddr":ipaddr}
                            print(mapping_dictionary)
                            break
                        except socket.gaierror:
                            print("Unable to map", hostname_parse.path,"moving on...")
                            break
        index += 1
    else:
        print("Script completed.")
        print("Script reviewed", url_array_len,"URLs.")
I probably shouldn't be answering a question like this, but anyway...
This part
for url_match in defined_url_regex.finditer(file_obj.read()):
means that you only reach the if statement when the URL is in the file.
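The other half of the symptom can be shown with a small sketch. It assumes the if/else pair sits directly inside the loop over rows, which is what the posted output suggests but which I cannot confirm from the post. The function name branch_for and the sample rows are made up for illustration. Because both branches end the loop, the decision is made on the first row alone:

```python
def branch_for(url, rows):
    """Mimic the row loop: report which branch ends the loop, and on which row."""
    for row in rows:
        if url in row:
            return ("if", row)
        else:
            # Reached as soon as one row lacks the URL, so any matching
            # row further down the file is never inspected.
            return ("else", row)
    # Only reached when there are no rows at all.
    return ("none", None)


# Hypothetical rows; the URL being searched for is in the second one.
rows = [
    "1,SomeInternalConfigRef,url9.domain.com,host9.provider.net,100.90.30.1",
    "2,SomeInternalConfigRef,url0-function.domain.com,host0.provider.net,100.10.30.1",
]
print(branch_for("url0-function.domain.com", rows)[0])
```

With these rows the else branch wins, even though the URL is present in the second row. If that matches your file, it would explain why the socket lookup runs for URLs that are in Properties.csv.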
If I understand your logic correctly, this can be rewritten in a shorter and simpler way:
url_array = input("What ...:").split(",")
# url_array = ['url1.in_csv_domian.com','url2.not_in_csv_domain.com']
properties_file = r"Properties.csv"
with open(properties_file, "r") as file_obj:
    content = file_obj.read()
for url in url_array:
    if url in content:
        print("properties from file")
    else:
        print("get from socket")
print("Script completed.")
print("Script reviewed", len(url_array), "URLs.")
For a CSV file that looks like this:
url1.in_csv_domian.com,url2.in_csv_domain.com
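To fill in the two placeholder branches, here is a rough sketch rather than a drop-in replacement. It assumes every row of Properties.csv has at least five comma-separated fields, with config_ref, hostname_ref, hostname and ipaddr in positions 1 to 4 as in the question's script. The function name map_urls and the resolve parameter are my own additions; resolve defaults to socket.gethostbyname and exists only so the lookup can be swapped out when testing.

```python
import csv
import io
import socket


def map_urls(urls, csv_file, resolve=socket.gethostbyname):
    """Return CSV-based dicts for URLs found in the file, DNS-based dicts otherwise."""
    # Read the file once; skip rows too short to hold the assumed columns.
    rows = [row for row in csv.reader(csv_file) if len(row) >= 5]
    results = []
    for url in urls:
        # Same substring test as the question, applied to each field.
        matches = [row for row in rows if any(url in field for field in row)]
        if matches:
            for row in matches:
                results.append({
                    "config_ref": row[1].strip(),
                    "hostname_ref": row[2].strip(),
                    "hostname": row[3].strip(),
                    "ipaddr": row[4].strip(),
                })
        else:
            # Fall back to DNS only after the whole file has been checked.
            try:
                results.append({"url": url, "hostname": url, "ipaddr": resolve(url)})
            except socket.gaierror:
                results.append("Unable to map %s moving on..." % url)
    return results


if __name__ == "__main__":
    # Hypothetical one-row file, kept in memory so no real path is needed.
    sample = io.StringIO(
        "1,SomeInternalConfigRef,url0-function.domain.com,host0.provider.net,100.10.30.1\n"
    )
    for item in map_urls(["url0-function.domain.com"], sample):
        print(item)
```

The key difference from the question's script is that the fallback is decided per URL after all rows have been examined, not per row. Note that a plain substring test also matches longer hostnames that contain the URL, so an exact comparison against one column may be a better fit if that matters for your data.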