Python + Selenium: web scraping

Question

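I am trying to use Selenium to extract some information from a website. Here is the link to the site: http://www.ultimatetennisstatistics.com/playerProfile?playerId=4742 The information I want is the player's statistics, which sit behind the "Statistics" dropdown button; that button takes you to another page. I have inspected the button and obtained its XPath and CSS selector, but when I run my program it does not open the player's statistics. Instead, it just opens the link below: http://www.ultimatetennisstatistics.com/playerProfile?playerId=4742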

and gives me an error:

NoSuchElementException: no such element: Unable to locate element: {"method":"css selector","selector":"#playerPills > li.dropdown.active.open > ul > li.active"}
  (Session info: chrome=67.0.3396.99)
  (Driver info: chromedriver=2.41.578737 (49da6702b16031c40d63e5618de03a32ff6c197e),platform=Windows NT 6.3.9600 x86_64)

Here is my code:

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://www.ultimatetennisstatistics.com/playerProfile?playerId=4742")
soup = BeautifulSoup(driver.page_source,"lxml")

bm = driver.find_element_by_css_selector('#playerPills > li.dropdown.active.open > ul > li.active')
bm.click()

Can someone tell me how to open the player's statistics page with Selenium and extract the information in the table?

Answer 1

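If you inspect the page's HTML source, you can read off the id of the button you want to click. With Selenium, you can find the button by its id with

driver.find_element_by_id('statisticsPill')

which lets you click it to display the table.
Once the table has loaded, you can parse it to get the data you want.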

Example:

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

driver = webdriver.Chrome()
driver.get("http://www.ultimatetennisstatistics.com/playerProfile?playerId=4742")

try:
    # First click on the dropdown that contains the statistics button
    dropdown = driver.find_element_by_xpath("//a[@id='statisticsPill']/../../..")
    dropdown.click()

    # Then click on the statistics button
    bm = driver.find_element_by_id('statisticsPill')
    bm.click()
except NoSuchElementException as e:
    # Handle the case where the dropdown or the button cannot be found
    print("Could not find element:", e)

Edit: you have to click the dropdown first to make the button visible, and only then click the button itself.
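For the parsing step, here is a minimal sketch of how the table rows could be pulled out of the HTML once the statistics have loaded. It uses the standard library's html.parser rather than the BeautifulSoup from the question, only so that it runs without extra installs. The sample HTML, the statistic names and values, and the names TableParser and parse_table are all made up for illustration; I have not checked the real table's markup, so treat it as a starting point. In the real script you would pass driver.page_source instead of sample_html.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return all table rows found in html as lists of cell strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample; in the real script use driver.page_source here
sample_html = """
<table>
  <tr><th>Statistic</th><th>Value</th></tr>
  <tr><td>Ace %</td><td>10.0%</td></tr>
  <tr><td>Double Fault %</td><td>3.0%</td></tr>
</table>
"""

rows = parse_table(sample_html)
# Skip the header row and keep only two-column rows as name -> value
stats = {row[0]: row[1] for row in rows[1:] if len(row) == 2}
print(stats)
```

If the page holds several tables, this collects the rows of all of them, so you would probably want to cut the HTML down to the statistics section first.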
