I have the following HTML:
<div id="aod-price-1" class="a-section a-spacing-none a-padding-none">
<span class="a-price" data-a-size="l" data-a-color="base">
<span class="a-offscreen">$79.58</span>
<span aria-hidden="true">
<span class="a-price-symbol">$</span>
<span class="a-price-whole">
"79"
<span class="a-price-decimal">.</span>
</span>
<span class="a-price-fraction">58</span>
</span>
</span>
</div>
I am trying to extract $79.58.
I used:
priceFound = WebDriverWait(browser,10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-price']")))
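This seems to work, but not quite as I expected.
It returns:
$79
58
Two separate lines, with no decimal point.
I am trying to extract the full text string: $79.58
I have even tried: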
priceFound = WebDriverWait(browser,10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-offscreen']")))
and
priceFound = WebDriverWait(browser,10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-price-whole']")))
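Those two did not work.
Update based on the suggestions so far:
Note that priceFound is a list, and the page contains several blocks like the one in the HTML above (many prices).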
<div id="aod-price-1" ... </div>
<div id="aod-price-2" ... </div>
<div id="aod-price-3" ... </div>
<div id="aod-price-4" ... </div>
For clarity I posted only one block; that is why I am selecting a list.
priceFound = WebDriverWait(browser, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-price']/span[@class='a-offscreen']")))
for price in priceFound:
    print(price.text)
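Returns: several empty lines (specifically, empty carriage returns).
I wonder whether a .text reference is needed somewhere in the XPath?
Update 2:
I use the following to click the "See All Buying Choices" button. It works. I then wait briefly for the prices to appear.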
Expand_button_Element = browser.find_element_by_id("buybox-see-all-buying-choices")
Expand_button_Element.click()
Update 3:
wait = WebDriverWait(browser, 10)
# wait for panel to be visible
wait.until(EC.visibility_of_element_located((By.ID, "aod-container")))
# this wait is probably no longer needed but left in to be safe
priceFound = wait.until(EC.visibility_of_all_elements_located((By.XPATH,"//span[@class='a-price']")))
for price in priceFound:
    print(price.text)
Produces (as an example): $77 23 $77 24 $79 59 $79 94 $78 95 $83 94 $79 99 $79 95 $79 99 $89 00
But when I try the code suggested below:
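The visible text of span.a-price comes back as a whole part and a fraction on separate lines (for example $79 followed by 58). As a stopgap while the a-offscreen lookup is not working, the two pieces could be rejoined in plain Python. This is only a sketch: join_price_parts is a hypothetical helper name, and it assumes every price arrives as exactly two whitespace-separated pieces with a numeric fraction.

```python
def join_price_parts(text):
    """Rejoin a price whose whole part and fraction were split by whitespace."""
    parts = text.split()
    # Assumption: a split price is exactly two pieces and the second is all digits.
    if len(parts) == 2 and parts[1].isdigit():
        return parts[0] + "." + parts[1]
    # Anything else is returned unchanged apart from surrounding whitespace.
    return text.strip()


# Sample strings shaped like the .text output shown above.
sample_outputs = ["$79\n58", "$77\n23", "$89\n00"]
joined = [join_price_parts(t) for t in sample_outputs]
print(joined)
```

With Selenium this would be called as join_price_parts(price.text) inside the loop. It works around the symptom only; it does not explain why the a-offscreen span yields no text.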
browser.find_element_by_id("buybox-see-all-buying-choices").click()
wait = WebDriverWait(browser, 10)
# wait for panel to be visible
wait.until(EC.visibility_of_element_located((By.ID, "aod-container")))
# this wait is probably no longer needed but left in to be safe
priceFound = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.a-price > span.a-offscreen")))
for price in priceFound:
    print(price.text)
I get the following error:
priceFound = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.a-price > span.a-offscreen")))
File "/home/codingArea/.local/lib/python3.8/site-packages/selenium/webdriver/support/wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
This seems to be a similar question:
I thought this would definitely work:
priceFound = wait.until(EC.visibility_of_all_elements_located((By.XPATH,"//span[@class='a-price']/span[@class='a-offscreen']")))
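But it times out, which makes no sense to me. I made sure I am running the latest Selenium.
Update 4: I used the following. It raises no error, but it produces empty lines (about 10 carriage returns).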
priceFound = browser.find_elements_by_css_selector('span.a-offscreen')
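for price in priceFound:
    print(price.text)
You tried several approaches but did not post the result of each. So, as a baseline, let's start with something simple and see whether it solves the problem. If not, we can work from there.
Let's use a simple CSS selector and add a wait for visibility. (Note: you were using presence, which only means the element is in the DOM, not that it is visible.)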
# this code starts after clicking link to open product panel with pricing
browser.find_element_by_id("buybox-see-all-buying-choices").click()
wait = WebDriverWait(browser, 10)
# wait for panel to be visible
wait.until(EC.visibility_of_element_located((By.ID, "aod-container")))
# this wait is probably no longer needed but left in to be safe
priceFound = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.a-price > span.a-offscreen")))
for price in priceFound:
    print(price.text)
product.find_element(By.XPATH, ".//span[@class='a-price']//span[@class='a-offscreen']").get_attribute('textContent').strip()
Try the above.
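The one-liner above reads a single price from a product element that the snippet does not define. Below is a minimal sketch of the same idea applied to a whole list. It assumes the empty strings reported in the question arise because span.a-offscreen is visually hidden: Selenium's .text returns only rendered text, while the textContent attribute is read straight from the DOM. FakeElement and read_prices are hypothetical names, and FakeElement exists only so the sketch runs without a browser.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, text_content):
        self._text_content = text_content

    def get_attribute(self, name):
        # Only the single attribute used below is modelled.
        return self._text_content if name == "textContent" else None


def read_prices(elements):
    """Return the stripped textContent of each element, skipping empty ones."""
    prices = []
    for element in elements:
        raw = element.get_attribute("textContent") or ""
        if raw.strip():
            prices.append(raw.strip())
    return prices


# With Selenium, the list would come from something like (assumed, not
# verified against the live page):
# elements = browser.find_elements(By.CSS_SELECTOR, "span.a-price > span.a-offscreen")
elements = [FakeElement(" $79.58 "), FakeElement(""), FakeElement("$77.23")]
prices = read_prices(elements)
print(prices)
```

If the hidden-element assumption holds, a visibility wait on span.a-offscreen would be expected to time out, so a presence wait (or waiting on the visible span.a-price and then reading textContent from its child) may be the safer pairing.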