How do I use chromedriver on AWS EC2 (Python)?


I need some help with a small project I am using to learn Python web scraping. Running my script produces this error:

Traceback (most recent call last):
File "ridi_find.py", line 5, in <module>
driver = webdriver.Chrome(chromedriver)
File "/home/ubuntu/play_python/venv/lib/python3.5/site-packages/selenium/webdriver/chrome/webdriver.py", line 69, in __init__
desired_capabilities=desired_capabilities)
File "/home/ubuntu/play_python/venv/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 92, in __init__
self.start_session(desired_capabilities, browser_profile)
File "/home/ubuntu/play_python/venv/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 179, in start_session
response = self.execute(Command.NEW_SESSION, capabilities)
File "/home/ubuntu/play_python/venv/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
self.error_handler.check_response(response)
File "/home/ubuntu/play_python/venv/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally
(Driver info: chromedriver=2.9.248304,platform=Linux 4.4.0-53-generic x86_64)

I installed chromedriver (linux64) and am using Python 3.5.2 on an AWS EC2 server (Ubuntu).

Here is the source code, but it does not work.

from selenium import webdriver
import pandas as pd

chromedriver = '/home/ubuntu/play_python/venv/bin/chromedriver'
driver = webdriver.Chrome(chromedriver)
driver.get('http://ridibooks.com/')

find_some = input("what do you want to know?")
find_some = find_some + '\n'

search = driver.find_element_by_css_selector("input[id='book_search_input']")
search.send_keys(find_some)
searches = driver.find_element_by_id('books_contents')

book_lists = []

for l in searches.find_elements_by_css_selector("span.title_text"):
    book_lists.append(l.text)

easy_index = pd.Series(0, index = range(1, len(book_lists) +1))
book_series = pd.Series(book_lists, index = easy_index.index)

print(book_series)
driver.quit()

Can anyone help me solve this problem?

python 3.5.2

Ubuntu 16.04.1 LTS

Chromedriver_linux64

ubuntu amazon-ec2 selenium-chromedriver
1 Answer

I am working on a similar problem, and it looks like you need a "fake" X environment such as xvfb:

sudo yum install xorg-x11-server-Xvfb unzip
wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/2.10/chromedriver_linux64.zip && sudo unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/;
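To check whether a missing X display is a plausible cause before installing anything, something like the following might help. It is a minimal sketch using only the standard library; the function name `has_x_display` is hypothetical, and a non-empty `DISPLAY` only means a display is configured, not that the X server behind it is actually reachable.

```python
import os


def has_x_display(environ=None):
    """Return True if DISPLAY is set to a non-empty value.

    `environ` defaults to os.environ; passing a dict makes the check
    easy to try out without touching the real environment.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("DISPLAY", "").strip())


def report():
    """Print a short hint about the current environment."""
    if has_x_display():
        print("DISPLAY is set to", os.environ["DISPLAY"])
    else:
        print("DISPLAY is not set; a non-headless Chrome will likely fail to start")
```

On a fresh EC2 instance reached over SSH, `report()` would normally take the second branch, which is consistent with the "Chrome failed to start" error in the question.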

Second, it looks like AWS does not provide some of the required libraries: https://forums.aws.amazon.com/message.jspa?messageID=713847
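One way to see which shared libraries are missing is to run `ldd` on the binary and look for lines marked "not found". The sketch below assumes `ldd` is available on the instance and that chromedriver was unzipped to `/usr/local/bin/chromedriver` as in the command above; the helper names `missing_libraries` and `check_binary` are hypothetical.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "\tlibfoo.so.1 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path):
    """Run ldd on `path` and return its unresolved libraries."""
    result = subprocess.run(
        ["ldd", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return missing_libraries(result.stdout)
```

Calling `check_binary("/usr/local/bin/chromedriver")` returns a list of names to search for in the package manager; an empty list suggests the driver's own dependencies resolve, although the Chrome binary itself may still be missing something.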

Add the following to /etc/yum.repos.d/centos.repo:

[CentOS-base]
name=CentOS-6 - Base
mirrorlist=http://mirrorlist.centos.org/?release=6&arch=x86_64&repo=os
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
#released updates
[CentOS-updates]
name=CentOS-6 - Updates
mirrorlist=http://mirrorlist.centos.org/?release=6&arch=x86_64&repo=updates
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
#additional packages that may be useful
[CentOS-extras]
name=CentOS-6 - Extras
mirrorlist=http://mirrorlist.centos.org/?release=6&arch=x86_64&repo=extras
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6

Next, run:

sudo rpm --import http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6 

Finally:

sudo yum install GConf2

Edit: My mistake, but the process is similar. Since you are using Ubuntu, just change where you get the packages from:

sudo apt-get install python-pip
sudo apt-get install xvfb xserver-xephyr vnc4server
sudo pip install pyvirtualdisplay

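My code looks like this:

from pyvirtualdisplay import Display
from selenium import webdriver

# Start a virtual X display (backed by Xvfb) so Chrome has a screen to render to
display = Display(visible=0, size=(1300, 1080))
display.start()
# With no path argument, chromedriver must be on PATH
driver = webdriver.Chrome()
driver.set_window_size(1300, 1080)
driver.get("https://www.google.com")
...
driver.close()
driver.quit()
display.stop()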
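
If the script raises partway through, the virtual display above is never stopped. A possible way around that is to wrap the start/stop pair in `try`/`finally`. This is only a sketch: `run_with_virtual_display` and `open_google` are hypothetical names, and the `make_display` parameter exists only so the wrapper can be tried without Xvfb installed.

```python
def run_with_virtual_display(task, make_display=None):
    """Start a virtual display, run `task`, and always stop the display."""
    if make_display is None:
        # Imported lazily so the wrapper can be tested with a stand-in display
        from pyvirtualdisplay import Display

        def make_display():
            return Display(visible=0, size=(1300, 1080))

    display = make_display()
    display.start()
    try:
        return task()
    finally:
        display.stop()


def open_google():
    """Example task: open a page in Chrome and return its title."""
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes chromedriver is on PATH
    try:
        driver.get("https://www.google.com")
        return driver.title
    finally:
        driver.quit()
```

With xvfb, pyvirtualdisplay, selenium, and Chrome installed, `run_with_virtual_display(open_google)` would run the browser session inside the virtual display and release both even if the page load fails.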