Python Selenium: scraping a multi-page table

Question

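The purpose of this code is to scrape a multi-page data table from a specific URL. However, it does not get past the first row.

Here is the code: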

from selenium import webdriver


class DataEngine:
    def __init__(self):
        self.url = 'https://www.investing.com/economic-calendar/house-price-index-147'
        self.driver = webdriver.PhantomJS(r"D:\Projects\Tutorial\Driver\phantomjs-2.1.1-windows\bin\phantomjs.exe")

    def title(self):
        self.driver.get(self.url)
        title = self.driver.find_elements_by_xpath('//*[@id="leftColumn"]/h1')
        for title in title:
            print(title.text)

    def table(self):
        self.driver.get(self.url)
        while True:
            table = self.driver.find_elements_by_xpath('//*[@id="historicEvent_372690"]')
            for table in table:
                print(table.text)

Answer 1

To make sure the code scrapes every row on the page, change the XPath from

//*[@id="historicEvent_372690"]

to

//*[contains(@id,"historicEvent_")]

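The XPath you are currently using matches only the first row, because it targets one exact id. The XPath I've shared uses the contains() function to find every element whose id contains historicEvent_.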
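
As a rough illustration of that change, here is a minimal sketch of a row reader built around the contains() XPath. The names scrape_rows and run_with_chrome are hypothetical and not part of the original code. The sketch also assumes Selenium 4, where the find_elements_by_xpath helper and the PhantomJS driver are no longer available, so it calls find_elements with the "xpath" strategy and uses headless Chrome instead. It reads the rows once rather than inside the question's `while True` loop, which has no exit condition.

```python
# XPath taken from the answer: matches every element whose id contains "historicEvent_".
ROW_XPATH = '//*[contains(@id,"historicEvent_")]'


def scrape_rows(driver, url):
    """Load `url` and return the text of every element matched by ROW_XPATH."""
    driver.get(url)
    # "xpath" is the strategy string held by selenium's By.XPATH constant.
    rows = driver.find_elements("xpath", ROW_XPATH)
    return [row.text for row in rows]


def run_with_chrome():
    """Assumption: Selenium 4 and a local Chrome installation are available."""
    from selenium import webdriver  # imported lazily so scrape_rows stays driver-agnostic

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        url = "https://www.investing.com/economic-calendar/house-price-index-147"
        for text in scrape_rows(driver, url):
            print(text)
    finally:
        # Always close the browser, even if the page fails to load.
        driver.quit()
```

Calling run_with_chrome() would print one block of text per matched element. This only covers the rows currently rendered on the page; loading further pages of the table would need an additional step that depends on how the site paginates.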
