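I am trying to write a Python script that uses Selenium to retrieve some information from this website. Cloudflare seems to block the script because it detects it as a bot. Here is part of my code: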
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.wait import WebDriverWait
options = webdriver.ChromeOptions()
options.add_argument("--disable-blink-features=AutomationControlled")
options.add_argument("--incognito")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
prefs = {"profile.managed_default_content_settings.images": 2,
"profile.default_content_setting_values.notifications": 2
}
options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(options=options)
driver.get("https://worldwide.espacenet.com/")
# Wait for the toggle that switches the page to advanced search mode.
advanced_search = WebDriverWait(driver, 10).until(
    lambda x: x.find_element(By.XPATH, "/html/body/div/div/nav/ul/li[5]/label/span")
)
advanced_search.click()
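Cloudflare immediately blocks the page and asks the user to verify that they are human. Is there a way to make the website think it is interacting with a real user, or are there options I can add directly to the Selenium WebDriver?
I have also tried adding a script that moves the mouse randomly around the window.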
Use the undetected-chromedriver library to launch the browser and interact with it. I have tested it, and it gets past the Cloudflare human check.
The code is as follows:
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
import undetected_chromedriver as uc
driver = uc.Chrome()
driver.get("https://worldwide.espacenet.com/")
# Wait for the toggle that switches the page to advanced search mode.
advanced_search = WebDriverWait(driver, 10).until(
    lambda x: x.find_element(By.XPATH, "/html/body/div/div/nav/ul/li[5]/label/span")
)
advanced_search.click()
time.sleep(15)
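A side note on the `WebDriverWait(driver, 10).until(...)` call used in both snippets: it is an explicit wait, meaning the callable is retried until it returns something truthy or the timeout runs out. The sketch below illustrates that polling pattern with the standard library only. `wait_until` and `WaitTimeout` are hypothetical names I made up for illustration, and this is not Selenium's own implementation, just a rough model of the idea.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy within the timeout."""


def wait_until(condition, timeout=10.0, poll_interval=0.5, ignored=(Exception,)):
    """Call condition() repeatedly until it returns a truthy value.

    Hypothetical helper that mimics the explicit-wait pattern; it is not
    part of Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            # Treat "not ready yet" errors as a failed attempt and retry.
            pass
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll_interval)
```

With a real driver, the equivalent call would look something like `wait_until(lambda: driver.find_element(By.XPATH, "..."))`. In practice `WebDriverWait` is probably the better choice; the point is only that an explicit wait returns as soon as the element is available, whereas a fixed `time.sleep` always waits the full duration.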