I am trying to run a crawler with Selenium on Linux.
However, I get the following error message.
File "craw_after1day.py", line 202, in <module>
driver = webdriver.Chrome('/home/ec2-user/linux_chromedirver/chromedriver',options=options)
File "/home/ec2-user/.local/lib/python3.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
self.service.start()
File "/home/ec2-user/.local/lib/python3.7/site-packages/selenium/webdriver/common/service.py", line 98, in start
self.assert_process_still_running()
File "/home/ec2-user/.local/lib/python3.7/site-packages/selenium/webdriver/common/service.py", line 111, in assert_process_still_running
% (self.path, return_code)
selenium.common.exceptions.WebDriverException: Message: Service /home/ec2-user/linux_chromedirver/chromedriver unexpectedly exited. Status code was: 127
This is the code I tried:
import json
import time
import requests
import pymysql
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument('lang=en')
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--single-process')
options.add_argument('--disable-dev-shm-usage')
# add a fake user agent
options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/5$
# must be used together with options.add_argument('lang=en') (binding language option)
options.add_experimental_option('prefs', {'intl.accept_languages': 'en,en_US'})
driver = webdriver.Chrome('/home/ec2-user/linux_chromedirver/chromedriver',options=options)
How can I run Selenium on Linux?
Please help me.
How to install the Chrome driver on Linux
For this, you need a headless browser.
1) You need to install the Chrome binary
# Install Chrome.
curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
echo "deb http://dl.google.com/linux/chrome/deb/ stable main" | sudo tee -a /etc/apt/sources.list.d/google-chrome.list
# Update our system
sudo apt-get -y update
# Install Chrome
sudo apt-get -y install google-chrome-stable
2) After that, you need to install the Chrome driver.
# Install Chromedriver
wget -N https://chromedriver.storage.googleapis.com/95.0.4638.54/chromedriver_linux64.zip -P ~/
unzip ~/chromedriver_linux64.zip -d ~/
# Remove zip file
rm ~/chromedriver_linux64.zip
# Move driver to bin location
sudo mv -f ~/chromedriver /usr/local/bin/chromedriver
# Give it rights
sudo chown root:root /usr/local/bin/chromedriver
sudo chmod 0755 /usr/local/bin/chromedriver
3) Install Selenium
# Install Selenium
pip install selenium
4) You are all set; just test it with a script.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
# Set path Selenium
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'
s = Service(CHROMEDRIVER_PATH)
WINDOW_SIZE = "1920,1080"
# Options
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--window-size=%s" % WINDOW_SIZE)
chrome_options.add_argument('--no-sandbox')
driver = webdriver.Chrome(service=s, options=chrome_options)
# Get the response and print title
driver.get("https://www.python.org")
print(driver.title)
# quit() closes every window and also stops the chromedriver process
driver.quit()
You now have Selenium running on Linux. The script should give you:
>>> Welcome to Python.org
Keep an eye on the chromedriver version: it should match the version of the installed Chrome browser.
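As a rough way to check that the two versions line up, the sketch below compares the major version numbers printed by `google-chrome --version` and `chromedriver --version`. The helper names (`major_version`, `versions_match`, `read_version`) are made up for illustration, and the binary name `google-chrome` is an assumption that may differ on your distribution.

```python
import re
import shutil
import subprocess


def major_version(version_output):
    """Return the major version number found in a `--version` output string."""
    match = re.search(r"(\d+)\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError("no version number found in %r" % version_output)
    return int(match.group(1))


def versions_match(chrome_output, driver_output):
    """True when Chrome and chromedriver report the same major version."""
    return major_version(chrome_output) == major_version(driver_output)


def read_version(binary):
    """Run `<binary> --version`; return its output, or None if it cannot be run."""
    if shutil.which(binary) is None:
        return None
    try:
        result = subprocess.run([binary, "--version"],
                                capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    chrome = read_version("google-chrome")  # assumed binary name
    driver = read_version("chromedriver")
    if chrome is None or driver is None:
        print("Could not read a version from Chrome and/or chromedriver.")
    else:
        try:
            same = versions_match(chrome, driver)
        except ValueError as exc:
            print("Unexpected version output:", exc)
        else:
            print("match" if same else "MISMATCH", "|", chrome, "|", driver)
```

Only the major version is compared here, which is a simplification; a mismatch reported by this check is a hint to download a different chromedriver build, not proof of the root cause.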
You can use
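from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
# executable_path works on Selenium 3; Selenium 4 deprecates it in favor of a Service object
driver = webdriver.Chrome(executable_path=ChromeDriverManager().install())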
It will always run your code against the latest chromedriver.
It looks like a Chrome driver mismatch issue.
Download the latest chromedriver version and use it like this:
driver = webdriver.Chrome(executable_path='/path/to/chromedriver', options=options)
driver.get("http://www.python.org")
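Before handing a driver path to Selenium, it may help to confirm that the binary at that path can start at all. An exit status of 127, as in the question, often indicates that the program could not be run, for example because shared libraries are missing, rather than a Selenium problem as such. The sketch below is one possible check; `check_driver` is a hypothetical helper, and the path is the one used in the install steps above.

```python
import os
import subprocess

CHROMEDRIVER_PATH = "/usr/local/bin/chromedriver"


def check_driver(path):
    """Return a short diagnosis string for the chromedriver binary at `path`."""
    if not os.path.isfile(path):
        return "missing: no file at %s" % path
    if not os.access(path, os.X_OK):
        return "not executable: try chmod 0755 %s" % path
    try:
        # Starting the binary directly surfaces loader errors that Selenium hides
        result = subprocess.run([path, "--version"],
                                capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "cannot start: %s" % exc
    if result.returncode != 0:
        return "exited with status %d: %s" % (result.returncode,
                                              result.stderr.strip())
    return "ok: %s" % result.stdout.strip()


if __name__ == "__main__":
    print(check_driver(CHROMEDRIVER_PATH))
```

If the result starts with "exited with status 127", the accompanying stderr text usually names what the loader could not find, which is a more useful lead than the Selenium traceback alone.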
For driver and Chrome version issues, the latest versions can be checked at this link: the list of the latest available cross-platform Chrome versions