Selenium web scraping SSL connection issue

Question

I am trying to do web scraping with Selenium. The website I want to scrape requires authentication, so my goal is to log in and scrape some user-related data.

First I try to log in: I navigate to https://my.pitchbook.com/, and the site automatically redirects me to the following link: https://loginprod.morningstar.com/loginstate=hKFo2SA1NkY0R2IwakMyYVFwTXNGSF8zampvSVRwU21abWhOZqFupWxvZ2luo3RpZNkgX0NacFNHQTFfc29iOC1lckFEc3JGaFRaWHBNZkJ1Rk2jY2lk2SByWUMwT1V4 SDRpV05jbXpPanVwQjh6UnN0dWtlZXZyUg&client=rYC0OUxH4iWNcmzOjupB8zRstukeevrR&protocol=oauth2&redirect_uri=https%3A%2F%2Fmy.pitchbook.com%2Fauth0%2Fcallback&source=bus0155&response_type=code

After the redirect, a login page appears and I try to log in. However, I get an error:

I looked for a solution to the error, and I even added the following line:

chrome_options.add_argument('--ignore-certificate-errors')  # Ignore TLS certificate errors

Code:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

# Set up Chrome options
chrome_options = Options()
chrome_options.add_argument('--ignore-certificate-errors')  # Ignore TLS certificate errors

# Set up WebDriver service
service = Service(executable_path="chromedriver.exe")
driver = webdriver.Chrome(service=service, options=chrome_options)

# Navigate to the initial page
driver.get("https://my.pitchbook.com/")

try:
    # Increase the wait time
    wait = WebDriverWait(driver, 20)  

    # Wait for the URL to change after redirection
    wait.until(EC.url_changes("https://my.pitchbook.com/"))


    # Wait for the login form to load on the redirected page
    email_element = wait.until(EC.presence_of_element_located((By.ID, "emailInput")))  
    password_element = wait.until(EC.presence_of_element_located((By.ID, "passwordInput")))  
    login_button = wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "mds-button___ctrsi"))) 

    # Enter credentials (replace with actual credentials)
    email_element.send_keys("email")
    password_element.send_keys("password")
    login_button.click()

    # Wait for the login process to complete
    time.sleep(5)  # Adjust as necessary


    # Now you are authenticated, navigate to the desired page or interact with elements
    driver.get("https://my.pitchbook.com/dashboard/home")

    # Locate the element you want to interact with
    user_name = wait.until(EC.presence_of_element_located((By.CLASS_NAME, "button__caption_a580497eb72793758caf95a9250e8342")))
    value = user_name.text

    print(value)

    # Wait before closing
    time.sleep(5)

finally:
    # Close the browser
    driver.quit()

Any help would be greatly appreciated! (I am new to web scraping and Selenium.)

Answer 1

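A few things seem to need correcting. First, use try and except statements in your code (the script currently has only try/finally), so that errors are caught and handled rather than ending the run.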
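As a rough illustration of the try/except advice, the sketch below wraps a single step so that a failure is reported instead of raised. The helper name `attempt` and its parameters are hypothetical and not part of Selenium. With Selenium you would probably pass `handled=(TimeoutException, WebDriverException)` from `selenium.common.exceptions`.

```python
def attempt(description, action, handled=(Exception,)):
    """Run action() and report the outcome instead of raising.

    Hypothetical helper: returns (True, result) on success,
    or (False, error message) if one of the `handled` exceptions occurs.
    """
    try:
        return True, action()
    except handled as exc:
        # Keep the exception type in the message so the cause stays visible.
        return False, f"{description} failed: {type(exc).__name__}: {exc}"


# Example with plain Python callables standing in for Selenium calls.
ok, value = attempt("add numbers", lambda: 1 + 1)
failed_ok, message = attempt("divide", lambda: 1 / 0, handled=(ZeroDivisionError,))
```

Catching an error this way only reports it. It does not repair the underlying SSL problem, so the message still needs to be read and acted on.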

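Second, keep in mind that an SSL handshake error occurs when something prevents a secure connection to the site from being established (I know, the error name is funny). A common cause is antivirus software that inspects HTTPS traffic. If you have one installed, try disabling it temporarily.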
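One way to check whether the handshake fails outside the browser as well (which would point to the machine or network rather than Selenium) is a small standard-library test. This is only a sketch: the function names `check_tls_handshake` and `main` are hypothetical, and the two hosts are the ones mentioned in the question.

```python
import socket
import ssl


def check_tls_handshake(host, port=443, timeout=10.0):
    """Attempt a verified TLS handshake; return (ok, detail).

    detail is the negotiated TLS version on success,
    or a description of the error on failure.
    """
    context = ssl.create_default_context()  # certificate verification enabled
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except OSError as exc:
        # ssl.SSLError, timeouts and DNS failures are all OSError subclasses.
        return False, f"{type(exc).__name__}: {exc}"


def main():
    # Hosts taken from the question; call main() manually to run the check.
    for host in ("my.pitchbook.com", "loginprod.morningstar.com"):
        print(host, check_tls_handshake(host))
```

If this check fails with a certificate verification error while the antivirus is on and succeeds once it is off, HTTPS interception is a likely culprit.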

Alternatively, try running the same code with Mozilla Firefox, Opera, Safari, or any other secure browser.
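A minimal sketch of the same first step in Firefox, assuming Selenium 4 and Firefox are installed. The helper names `build_firefox_driver` and `main` are hypothetical. `accept_insecure_certs` is the WebDriver capability roughly comparable to Chrome's `--ignore-certificate-errors`, and it is left off by default here.

```python
START_URL = "https://my.pitchbook.com/"  # starting page from the question


def build_firefox_driver(accept_insecure_certs=False):
    """Create a Firefox WebDriver (hypothetical helper, Selenium 4 assumed)."""
    # Imported lazily so this module loads even where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    # Leave False to keep certificate checks on; True accepts invalid certificates.
    options.accept_insecure_certs = accept_insecure_certs
    return webdriver.Firefox(options=options)


def main():
    # Call main() manually; it opens a real browser window.
    driver = build_firefox_driver()
    try:
        driver.get(START_URL)
        print(driver.current_url)  # shows where the redirect ended up
    finally:
        driver.quit()
```

If the login page loads in Firefox but not in Chrome, the problem is more likely in the Chrome or chromedriver setup than in the network.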

Thank you! Take care!
