How do I configure a Dockerfile to run Chrome with Selenium?


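I am trying to set up a Docker container to run a Python Selenium script with Google Chrome on DigitalOcean. My Dockerfile is meant to install Google Chrome and ChromeDriver so that my Selenium script can scrape data from web pages. However, I get an error when it tries to install Google Chrome.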

# Use a lightweight Python base image
FROM python:3.10

WORKDIR /app

COPY . /app

RUN pip install --trusted-host pypi.python.org -r requirements.txt

# install google chrome
RUN apt-get update && \
    apt-get install -y wget gnupg && \
    wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/google-linux-signing-key.gpg && \
    echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-linux-signing-key.gpg] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google-chrome.list && \
    apt-get update && \
    apt-get install -y google-chrome-stable

# install chromedriver
RUN apt-get install -yqq unzip && \
    wget -O /tmp/chromedriver.zip "http://chromedriver.storage.googleapis.com/$(curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE)/chromedriver_linux64.zip" && \
    unzip /tmp/chromedriver.zip -d /usr/local/bin/

# set display port to avoid crash
ENV DISPLAY=:99

CMD ["python", "app/scraper.py"]

The error I get is: E: Unable to locate package google-chrome-stable

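What I have tried:

  • I updated the key installation to use gpg and configured the Google Chrome repository URL as suggested in some answers, but the error persists.
  • I tried running apt-get update manually before the Chrome installation step to make sure the package lists were up to date, but that did not help either.
  • I tried the solution from this tutorial: https://www.youtube.com/watch?v=xrYDlx8evR0&t=541s

Another solution I tried is: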

# Use a lightweight Python base image
FROM python:3.10-slim

# Avoid interactive prompts during package installations
ENV DEBIAN_FRONTEND=noninteractive

# Install system dependencies required by Chrome and ChromeDriver
RUN apt-get update && apt-get install -y \
    chromium-driver \
    chromium \
    fonts-liberation \
    libnss3 \
    libx11-6 \
    libatk-bridge2.0-0 \
    libatspi2.0-0 \
    libgtk-3-0 \
    libxcomposite1 \
    libxcursor1 \
    libxdamage1 \
    libxrandr2 \
    libgbm-dev \
    && apt-get clean && rm -rf /var/lib/apt/lists/*

# Set environment variables for Chromium and ChromeDriver paths
ENV CHROME_BIN=/usr/bin/chromium \
    CHROME_DRIVER_BIN=/usr/bin/chromedriver

# Set the working directory in the container
WORKDIR /app

# Copy the requirements.txt file to the working directory
COPY requirements.txt .

# Install Python dependencies from the requirements file
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code to the working directory
COPY . .

# Expose the port your application will run on
EXPOSE 8080

This solution produces the following error during deployment:

  File "/workspace/app/scraper.py", line 21, in
    driver = webdriver.Chrome(options=chrome_options)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.heroku/python/lib/python3.12/site-packages/selenium/webdriver/chrome/webdriver.py", line 45, in __init__
    super().__init__(
  File "/workspace/.heroku/python/lib/python3.12/site-packages/selenium/webdriver/chromium/webdriver.py", line 55, in __init__
    self.service.start()
  File "/workspace/.heroku/python/lib/python3.12/site-packages/selenium/webdriver/common/service.py", line 105, in start
    self.assert_process_still_running()
  File "/workspace/.heroku/python/lib/python3.12/site-packages/selenium/webdriver/common/service.py", line 118, in assert_process_still_running
    raise WebDriverException(f"Service {self._path} unexpectedly exited. Status code was: {return_code}")
selenium.common.exceptions.WebDriverException: Message: Service /workspace/.cache/selenium/chromedriver/linux64/130.0.6723.69/chromedriver unexpectedly exited. Status code was: 127
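
Two details in that traceback may be worth noting. The driver that crashed lives under /workspace/.cache/selenium, which is where Selenium Manager caches drivers it downloads itself, so the /usr/bin/chromedriver installed by the Dockerfile was apparently not the one being used. The /workspace/.heroku/python/lib/python3.12 paths also suggest the app may have been built by a buildpack rather than from the python:3.10-slim image, although that is only an inference from the paths. Exit status 127 conventionally means the command, or a shared library it needs, could not be found. A minimal sketch of pointing Selenium at the system browser and driver follows. It assumes Selenium 4 and the CHROME_BIN and CHROME_DRIVER_BIN variables from the Dockerfile above. The helper names and the Chrome flags are illustrative and do not come from the original scraper.py.

```python
import os

# Flags commonly passed to Chrome/Chromium inside containers. These are an
# assumption for illustration; the original scraper.py options are not shown.
HEADLESS_ARGS = ("--headless=new", "--no-sandbox", "--disable-dev-shm-usage")


def resolve_paths(env=None):
    """Return (browser_path, driver_path) read from the environment.

    Falls back to the paths that the Dockerfile's ENV instruction sets.
    """
    env = os.environ if env is None else env
    browser = env.get("CHROME_BIN", "/usr/bin/chromium")
    driver = env.get("CHROME_DRIVER_BIN", "/usr/bin/chromedriver")
    return browser, driver


def build_driver(env=None):
    """Create a Chrome WebDriver that uses the system browser and driver."""
    # Imported lazily so resolve_paths() can be checked without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    browser, driver = resolve_paths(env)
    options = webdriver.ChromeOptions()
    options.binary_location = browser
    for arg in HEADLESS_ARGS:
        options.add_argument(arg)
    # Passing an explicit executable_path tells Selenium which driver binary
    # to launch instead of letting it locate or download one on its own.
    service = Service(executable_path=driver)
    return webdriver.Chrome(service=service, options=options)
```

In scraper.py, a call such as driver = build_driver() would then stand in for the direct webdriver.Chrome(options=chrome_options) call. This only helps if the container that actually runs is the one built from the Dockerfile.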

selenium-webdriver web-scraping dockerfile selenium-chromedriver
1 Answer

I used to install Chrome on an Ubuntu Linux machine with the following steps in a Dockerfile. You can try this:

      RUN apt-get update
      RUN mkdir -p /var/lib/apt/lists/partial
      RUN apt install -y libsm6 libxext6 ffmpeg libfontconfig1 libxrender1 libgl1-mesa-glx
      RUN apt -y --force-yes install libu2f-udev fonts-liberation libasound2  libatk-bridge2.0-0 libatk1.0-0  libatspi2.0-0  libcairo2 libcups2 libdrm2 libgbm1 libgtk-3-0 libnspr4 libnss3 libpango-1.0-0 libx11-6 libxcb1 libxcomposite1 libxdamage1 libxext6 libxfixes3 libxkbcommon0 libxrandr2 xdg-utils
      RUN apt -y --force-yes install wget
      RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
      RUN dpkg -i google-chrome-stable_current_amd64.deb
      RUN apt-get install -f -y
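
These steps install only the browser, so a matching ChromeDriver is still needed. The legacy chromedriver.storage.googleapis.com LATEST_RELEASE endpoint used in the question stops at version 114, which will not match a current google-chrome-stable. Below is a rough sketch of a check that compares the major versions of the two binaries. It assumes both are on PATH under the names google-chrome and chromedriver, and the function names are made up for illustration.

```python
import re
import shutil
import subprocess


def major_version(version_output):
    """Extract the major version from text like 'Google Chrome 130.0.6723.69'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def installed_major(binary):
    """Run `<binary> --version`; return its major version, or None if not on PATH."""
    if shutil.which(binary) is None:
        return None
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=True
    )
    return major_version(result.stdout)


def versions_match(browser="google-chrome", driver="chromedriver"):
    """True only when both binaries are found and share a major version."""
    browser_major = installed_major(browser)
    driver_major = installed_major(driver)
    return browser_major is not None and browser_major == driver_major
```

Calling versions_match() once at container start, before creating the WebDriver, would surface a browser/driver mismatch early instead of as a crash inside Selenium.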