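I am trying to scrape some information from a website. When I try to get an element by XPath, I get the error "Unable to locate element", even though the path I supply was copied directly from the inspect tool. I tried a few things, but none of them worked, so I decided to try a simpler path (TEST), and that still does not work. Is it possible that the website does not show all of its HTML code when inspecting?
Here is the code, with the website and the XPaths I tried.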
URL_TRADER = 'https://www.tipranks.com/analysts/joseph-foresi?benchmark=none&period=yearly'
TEST = 'html/body/div[@id="app"]/div[@class="logged-out free"]/div[@class="client-components-app-app__wrapper undefined undefined"]'#/div/div[1]/div/div[2]/div/section/main/table/tbody/tr[3]/td[3]/div/div/div/div[1]/span'
X_PATH = '//*[@id="app"]/div/div/div[2]/div/div[1]/div/div[2]/div/section/main/table/tbody/tr[1]/td[3]/div/div/div/div[1]/span'
The main function is:
def trader_table():
    # Loading Chrome and getting to the website
    driver = webdriver.Chrome(executable_path='/usr/local/bin/chromedriver')
    driver.get(URL_TRADER)
    driver.implicitly_wait(10)
    text = driver.find_element_by_xpath(X_PATH).get_attribute('innerHTML')
    return text
I added a wait condition and used a CSS selector combination, but I think this targets the same element as your XPath.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
url = 'https://www.tipranks.com/analysts/joseph-foresi?benchmark=none&period=yearly'
driver = webdriver.Chrome()
driver.get(url)
data = WebDriverWait(driver,10).until(EC.presence_of_element_located((By.CSS_SELECTOR, ".client-components-experts-infoTable-expertTable__table .client-components-experts-infoTable-expertTable__dataRow td:nth-child(3)"))).get_attribute('innerHTML')
print(data)
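You have provided all the details needed to construct an answer, but you did not explicitly mention which element you are trying to get.
However, the commented-out XPath in TEST hints that you are after the price targets. To extract the text of those elements, and because they are JavaScript-enabled elements, you need to induce WebDriverWait for visibility_of_all_elements_located(). You can use the following solution: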
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument('start-maximized')
options.add_argument('disable-infobars')
options.add_argument('--disable-extensions')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get("https://www.tipranks.com/analysts/joseph-foresi?benchmark=none&period=yearly")
print([element.get_attribute('innerHTML') for element in WebDriverWait(driver,10).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='client-components-experts-infoTable-expertTable__isBuy']//span")))])
['$14.00', '$110.00', '$237.00', '$36.00', '$150.00', '$71.00', '$188.00', '$91.00', '$101.00', '$110.00']
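If the goal is to work with these values numerically, the strings in the list above can be converted once they have been collected. The following is only a minimal sketch: the helper name `parse_price` is hypothetical, and it assumes every entry is a dollar amount formatted like the ones printed above.

```python
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Convert a string such as '$1,234.50' to Decimal('1234.50').

    Hypothetical helper: assumes a leading '$' and optional thousands commas.
    """
    cleaned = text.strip().replace("$", "").replace(",", "")
    return Decimal(cleaned)


# The list printed by the Selenium snippet above.
raw_prices = ['$14.00', '$110.00', '$237.00', '$36.00', '$150.00',
              '$71.00', '$188.00', '$91.00', '$101.00', '$110.00']

prices = [parse_price(p) for p in raw_prices]
print(max(prices))  # highest price target in the list
```

Decimal is used instead of float so that currency amounts keep their exact two-digit representation.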
I think you are after the price.
from selenium import webdriver
URL_TRADER = 'https://www.tipranks.com/analysts/joseph-foresi?benchmark=none&period=yearly'
TEST = 'html/body/div[@id="app"]/div[@class="logged-out free"]/div[@class="client-components-app-app__wrapper undefined undefined"]'#/div/div[1]/div/div[2]/div/section/main/table/tbody/tr[3]/td[3]/div/div/div/div[1]/span'
X_PATH = "//div[@class='client-components-experts-infoTable-expertTable__isBuy']/div/span"
def trader_table():
    driver = webdriver.Chrome(executable_path='/usr/local/bin/chromedriver')
    driver.get(URL_TRADER)
    driver.implicitly_wait(10)
    text = driver.find_element_by_xpath(X_PATH).get_attribute('innerHTML')
    print(text)
    return text
Edit: to get all rows:
from selenium import webdriver
URL_TRADER = 'https://www.tipranks.com/analysts/joseph-foresi?benchmark=none&period=yearly'
X_PATH = "//div[@class='client-components-experts-infoTable-expertTable__isBuy']/div/span"
def trader_table():
    driver = webdriver.Chrome(executable_path='/usr/local/bin/chromedriver')
    driver.get(URL_TRADER)
    driver.implicitly_wait(10)
    # find_elements (plural) returns every matching span, one per table row
    list_ele = driver.find_elements_by_xpath(X_PATH)
    price_list = []
    for ele in list_ele:
        print(ele.text)
        price_list.append(ele.text)
    return price_list
# Avoid naming the result "list", which would shadow the built-in type
prices = trader_table()
print(prices)
from selenium import webdriver
import time
driver = webdriver.Chrome("your webdriver location")
driver.get("https://www.tipranks.com/analysts/joseph-foresi?benchmark=none&period=yearly")
time.sleep(10)
y = driver.find_element_by_id('app').get_attribute('innerHTML')
print(y)
This prints the complete inner HTML.
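Once the inner HTML has been printed, it can be searched to see whether the class names used in an XPath are really present in what the browser rendered. The following is a rough sketch using only the standard library; `ClassCollector` and `find_classes` are hypothetical names, and the `sample` string is a made-up stand-in for the real output of `get_attribute('innerHTML')`.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every class name that appears in an HTML snippet."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def find_classes(html, fragment):
    """Return the sorted class names in `html` that contain `fragment`."""
    collector = ClassCollector()
    collector.feed(html)
    collector.close()
    return sorted(c for c in collector.classes if fragment in c)


# Made-up sample; in practice pass the string stored in `y` above.
sample = (
    '<div class="logged-out free">'
    '<div class="client-components-experts-infoTable-expertTable__isBuy">'
    '<div><span>$14.00</span></div></div></div>'
)
print(find_classes(sample, "expertTable"))
```

If the class an XPath relies on does not show up in the result, the element was most likely not rendered yet when the HTML was captured, or the class name differs from the one shown in the inspect tool.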