Why can't I scrape all the data?


With this script I am trying to scrape all the data from a specific website. The main problem is with the output: instead of a list of all the home teams, I only get the home team of the first match. What can I do to receive all the data from the site?

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome(executable_path=r"C:\Users\Lorenzo\Downloads\chromedriver.exe")
driver.get('https://www.diretta.it')
html = driver.page_source 
soup = BeautifulSoup(html,'lxml')
games = soup.find_all('div', class_ = 'event__match event__match--live event__match--last event__match--twoLine')
for game in games:
    home = soup.find('div', class_ = 'event__participant event__participant--home').text
    away = soup.find('div', class_ = 'event__participant event__participant--away').text
    time = soup.find('div', class_ = 'event__time').text
    print(home)
python selenium-webdriver web-scraping
2 Answers

1 vote

You are looping over games, but you never use game as the object for the find calls inside the loop, so every iteration searches the whole page again and returns the first match.

home = game.find('div', class_ = 'event__participant event__participant--home').text
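
For completeness, a sketch of the full corrected loop (same selectors as in the question), with away and time also looked up on game rather than soup:

for game in games:
    # search inside the current match element, not the whole page
    home = game.find('div', class_ = 'event__participant event__participant--home').text
    away = game.find('div', class_ = 'event__participant event__participant--away').text
    time = game.find('div', class_ = 'event__time').text
    print(home, away, time)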

0 votes

First of all, when using Selenium you don't need Beautiful Soup, because you can use find_element_by_* to locate a single tag and find_elements_by_* (with an s, plural) to get a list of all elements that match.

Your code would then be:

from selenium import webdriver

driver = webdriver.Chrome(executable_path=r"C:\Users\Lorenzo\Downloads\chromedriver.exe")
driver.get('https://www.diretta.it')

games = driver.find_elements_by_css_selector('div[class = "event__match event__match--live event__match--last event__match--twoLine"]')

for game in games:
    home = game.find_element_by_css_selector('div[class = "event__participant event__participant--home"]').text
    away = game.find_element_by_css_selector('div[class = "event__participant event__participant--away"]').text
    time = game.find_element_by_css_selector('div[class = "event__time"]').text
    
    print(home)
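
Note that in Selenium 4 the find_element_by_* / find_elements_by_* helpers were removed. A sketch of the same approach using the By locator API (assuming Selenium 4, where the driver binary is managed automatically and executable_path is no longer needed):

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.diretta.it')

# collect every live match container, then search inside each one
games = driver.find_elements(By.CSS_SELECTOR, 'div.event__match.event__match--live.event__match--last.event__match--twoLine')
for game in games:
    home = game.find_element(By.CSS_SELECTOR, 'div.event__participant--home').text
    away = game.find_element(By.CSS_SELECTOR, 'div.event__participant--away').text
    time = game.find_element(By.CSS_SELECTOR, 'div.event__time').text
    print(home, away, time)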