Selenium: Elements disappear after page load in a Python script, but not in my personal browser

Question

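I've run into a peculiar problem when using Selenium in a Python script to interact with a web page. Specifically, I'm trying to scrape data from the page linked in the code below.

Python code: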

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

link = "https://www.homedepot.com/p/Ejoy-20-in-H-x-20-in-W-GorgeousHome-Artificial-Boxwood-Hedge-Greenery-Panels-Milan-12-pc-Milan-12pc/314160722"

# Launch a Chrome browser
options = webdriver.ChromeOptions()
driver = webdriver.Chrome(options=options)
driver.get(link)

# Wait for the page to load
time.sleep(3)

# Find the drop-down element
dropdown = driver.find_element(By.XPATH, "//*[@id='product-section-specifications']")
dropdown.click()

# Attempt to find elements within the drop-down
headers = dropdown.find_elements(By.CSS_SELECTOR, "div[class='kpf__name']")
print(len(headers))
print(headers[0].text)

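The problem is that although the page appears to load correctly, certain elements (specifically those inside the drop-down) disappear before I can interact with them. Strangely, when I open the same page manually in my personal Chrome browser, everything loads as expected and stays visible.

I initially suspected that a script might be removing the content, but after comparing the console output in Selenium and in my personal browser, both show the same "error" output, which leads me to believe that this is probably not the cause.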

Answer 1

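The site has some kind of anti-bot protection enabled. Many sites do this nowadays to prevent scraping and similar automated access.

That said, I rewrote your script to be cleaner and more efficient, in case you want some suggestions for improvement...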

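You don't need to define `ChromeOptions()` if you aren't going to use it. You can simply use `driver = webdriver.Chrome()`.

Don't use `time.sleep()`. It's not good practice and is considered a "dumb" wait: it always waits the full specified time, even if the element is available sooner. Use `WebDriverWait` instead and wait for a specific condition, e.g. clickable if you're going to click something, or visible if you're going to interact with the element in some other way.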
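To illustrate why a condition-based wait beats a fixed sleep, here is a minimal pure-Python sketch of the polling idea. It is not Selenium's actual implementation; `wait_until` and `element_is_ready` are hypothetical names invented for this illustration, and the "element" is simulated with a timer so the snippet runs without a browser.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper that mimics the general idea behind an explicit wait.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Return as soon as the condition holds, without waiting out the timeout.
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Hypothetical stand-in for an element that becomes available after 0.2 s.
ready_at = time.monotonic() + 0.2


def element_is_ready():
    return time.monotonic() >= ready_at


start = time.monotonic()
wait_until(element_is_ready, timeout=3.0, poll_interval=0.05)
elapsed = time.monotonic() - start
print(f"returned after about {elapsed:.2f} s instead of the full 3 s")
```

A fixed `time.sleep(3)` would always block for three seconds here, whereas the polling version returns shortly after the condition becomes true and raises an error if it never does.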

With all that in mind, here's the rewritten script:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://www.homedepot.com/p/Ejoy-20-in-H-x-20-in-W-GorgeousHome-Artificial-Boxwood-Hedge-Greenery-Panels-Milan-12-pc-Milan-12pc/314160722"

# Launch a Chrome browser
driver = webdriver.Chrome()
driver.get(url)

wait = WebDriverWait(driver, 10)

# Open the Specification accordion element
wait.until(EC.element_to_be_clickable((By.ID, "product-section-key-feat"))).click()

# Find elements in the Specification section
headers = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.kpf__name")))
print(len(headers))
for header in headers:
    print(header.text)
