How do I do web scraping with Selenium correctly in Firefox?

Question

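I'm trying to write a Selenium program that does some basic web scraping: getting the title of a website. However, whenever I try even the most basic scrape with Selenium, with all the web drivers installed, I get a bunch of errors that I can't interpret, and I still haven't been able to find a solution anywhere online.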

Here is the code I entered in VS Code:

from selenium import webdriver

PATH = "C:\Program Files (x86)\geckodriver.exe"
driver = webdriver.Firefox(PATH)

driver.get("https://www.bestbuy.com")
print(driver.title)
driver.quit()

Here is the error output I received:

c:\Program Files (x86)\VS Code\Product Scalper.py:3: SyntaxWarning: invalid escape sequence '\P'
  PATH = "C:\Program Files (x86)\geckodriver.exe"
Traceback (most recent call last):
  File "c:\Program Files (x86)\VS Code\Product Scalper.py", line 4, in <module>
    driver = webdriver.Firefox(PATH)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\victo\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\selenium\webdriver\firefox\webdriver.py", line 57, in __init__
    if finder.get_browser_path():
       ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\victo\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\selenium\webdriver\common\driver_finder.py", line 47, in get_browser_path
    return self._binary_paths()["browser_path"]
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\victo\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\selenium\webdriver\common\driver_finder.py", line 56, in _binary_paths
    browser = self._options.capabilities["browserName"]
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'capabilities'
python selenium-webdriver web-scraping error-handling
1 Answer

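Use a raw string for PATH, like this: r"C:\Program Files (x86)\geckodriver.exe". In a normal string, the backslash introduces escape sequences such as \n. You can also use double backslashes, which works just as well: "C:\\Program Files (x86)\\geckodriver.exe".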

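As a minimal sketch of the advice above, the two spellings of the path can be checked against each other in plain Python. Separately, the traceback in the question ends in AttributeError: 'str' object has no attribute 'capabilities', which suggests that this Selenium version treats the first positional argument of webdriver.Firefox as an options object rather than a driver path. Assuming Selenium 4, the path would instead be handed over through a Service object. The helper name get_page_title is hypothetical and used only for illustration.

```python
# Both spellings produce the same string; only the source-code notation differs.
raw_path = r"C:\Program Files (x86)\geckodriver.exe"
escaped_path = "C:\\Program Files (x86)\\geckodriver.exe"
assert raw_path == escaped_path

# In a normal string "\n" is a single newline character; in a raw string it is
# a backslash followed by the letter n.
assert len("\n") == 1
assert len(r"\n") == 2


def get_page_title(url, driver_path=raw_path):
    """Open url in Firefox and return the page title (hypothetical helper)."""
    # Imported inside the function so the string checks above run even
    # where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service

    # Assumption: Selenium 4, where the driver path is passed via Service
    # instead of as the first positional argument.
    service = Service(executable_path=driver_path)
    driver = webdriver.Firefox(service=service)
    try:
        driver.get(url)
        return driver.title
    finally:
        # Close the browser even if loading the page fails.
        driver.quit()
```

With geckodriver at that path and Firefox installed, print(get_page_title("https://www.bestbuy.com")) would be the equivalent of the script in the question.
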