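I'm thoroughly confused by all the posts about chaining URL requests, and I can't work this out on my own. I'm trying to get some information from a web page and then open each "a href" link on it, which leads to a page holding more of the information I want.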
from bs4 import BeautifulSoup
import requests
from csv import reader, writer, DictWriter, DictReader

source = requests.get("http://www.bda-ieo.it/test/Group.aspx?Lan=Ita")
soup = BeautifulSoup(source.text, "html.parser")
titolo_sezione = ""
table_row = ""
with open("genere.txt", "w", newline="") as txt_file:
    headers = ["GRUPPO MERCEOLOGICO", "CODICE MERCEOLOGICO", "ALIMENTO"]
    csv_writer = DictWriter(txt_file, fieldnames=headers, delimiter=';')
    csv_writer.writeheader()
    for table_row in soup.find("table", id="tblResult").find_all("tr"):
        className = ""
        if table_row.get("class"):
            className = table_row.get("class").pop()
            if className == "testobold":
                titolo_sezione = table_row.text
            if className == "testonormale":
                for cds in table_row.find_all("td"):
                    url = cds.get("a")
                    urls = requests.get("http://www.bda-ieo.it/test/Groupfood.aspx?Lan=Ita + url")
                    dage = BeautifulSoup(urls.text, "html.parser")
                    alimenti = ""
                    for alimenti in dage:
                        id_alimento, destra = alimenti.find_all("td")
                        codice = id_alimento.text
                        nome = destra.text
                        href = destra.a.get("href")
                        print(f'{titolo_sezione}; {id_alimento.text}; {nome.text}')
The variable urls never opens any further page. Can someone help me figure out why? I'm stuck on this.
Thank you.
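You need to rework some of this logic and read up a little on string formatting. I've marked the places where I made changes. I'm not sure exactly what output you're after, but this should get you moving.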
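To illustrate the string-formatting point with a minimal sketch: in the question, `+ url` sits inside the quotes, so it is sent as literal text rather than being appended to the address. Separately, `cds.get("a")` looks up an attribute named `a` on the `<td>`, not the child link; the link's target would come from something like `cds.a.get("href")`, as the question already does with `destra`. The `href` value below is hypothetical, since I don't know what the real links on that page look like, and `urljoin` is just one reasonable way to build the address.

```python
from urllib.parse import urljoin

# Page the links were scraped from (taken from the question).
BASE_URL = "http://www.bda-ieo.it/test/Group.aspx?Lan=Ita"

# Hypothetical href, standing in for whatever cds.a.get("href") returns.
href = "Groupfood.aspx?Lan=Ita&Gruppo=01"

# As written in the question: "+ url" is part of the string literal itself.
broken = "http://www.bda-ieo.it/test/Groupfood.aspx?Lan=Ita + url"

# Resolve the relative href against the page it came from.
fixed = urljoin(BASE_URL, href)

print(broken)
print(fixed)
```

The resulting `fixed` string is what you would pass to `requests.get(...)`. If the hrefs on the page turn out to be absolute already, `urljoin` returns them unchanged, so the same call still applies.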