Problem using Python in Docker to load a web page and click a button with Selenium


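I can't for the life of me get this to work. I'm trying to load a web page and click a button on it, but nothing works. Selenium either complains, fails to load, reports that it cannot create a session, reports that it lacks the proper options, loads forever, or simply does nothing.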

Dockerfile

FROM python:3.11-slim-buster

USER root

# Create a non-root user
RUN useradd -ms /bin/bash appuser
WORKDIR /app
RUN chown appuser:appuser /app

USER appuser

COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the application source
COPY src .

# Expose the application port (e.g., 5000)
EXPOSE 5000

# Define the command to run the application
CMD ["python3", "app.py"]

Docker-compose.yml

version: '3.8'

services:
  chrome:
    image: selenium/node-chrome:3.14.0-gallium
    volumes:
      - /dev/shm:/dev/shm
    depends_on:
      - hub
    environment:
      HUB_HOST: hub
  hub:
    image: selenium/hub:3.14.0-gallium
    ports:
      - "4444:4444"

  web:
    build: .
    depends_on:
      - hub
    volumes:
      - ./src:/app
    ports:
      - "5000:5000"

app.py

from flask import Flask, render_template, request
import requests
import re
import os
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import urllib.parse
from selenium.webdriver.chrome.options import Options
def download_page(url):
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.page_load_strategy = 'normal'
    chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
    chrome_options.add_experimental_option('useAutomationExtension', False)
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--lang=en')
    chrome_options.add_argument('--ignore-certificate-errors')
    chrome_options.add_argument('--allow-running-insecure-content')
    chrome_options.add_argument('--disable-notifications')
    chrome_options.add_argument('--disable-dev-shm-usage')
    chrome_options.add_argument('--disable-browser-side-navigation')
    chrome_options.add_argument('--mute-audio')
    chrome_options.add_argument('--force-device-scale-factor=1')
    chrome_options.add_argument('window-size=1080x760')
    driver = webdriver.Remote('http://hub:4444/wd/hub')

    driver.get(url)
    # Process page or click buttons

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/process', methods=['POST'])
def process():
    url = request.form['url'] 

    download_page(url)
    return "URL processing complete!"

if __name__ == '__main__':
    app.run(host='0.0.0.0',debug=True)

index.html

<!DOCTYPE html>
<html>
<head>
    <title>URL Processor</title>
</head>
<body>
    <h1>Enter a URL to process:</h1>
    <form method="POST" action="/process">
        <input type="text" name="url" placeholder="Enter URL here">
        <button type="submit">Process URL</button>
    </form>
</body>
</html>

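I tried using selenium/standalone-chrome as the Docker base image, but it won't let pip install Flask because the environment is "externally managed".

I tried loading it externally, but it complains that it cannot create a session (SessionNotCreatedException).

I tried loading it internally, but it complains that it cannot find the Chrome driver, and when I try to install the driver it just hangs. No error, nothing; it just sits there.

If I run it standalone, without Flask, it works perfectly. It's only when I try to wrap it in a Dockerfile that it blocks me every time. It doesn't help that Selenium's documentation is out of date.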

python docker selenium-webdriver flask
1 Answer

You are creating the Chrome options but never passing them to the WebDriver.

Once I passed the options in, it worked fine for me.

Change this line:

driver = webdriver.Remote('http://hub:4444/wd/hub')

to:

driver = webdriver.Remote('http://hub:4444/wd/hub', options=chrome_options)
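For context, here is a minimal sketch of how the whole function might look with the options passed in and a button click added. It is an illustration under assumptions, not the asker's final code: `chrome_arguments`, the `button_css` parameter, and the 10-second timeout are hypothetical names and values, and the flag list is a trimmed subset of the one in the question (with the window size written in Chrome's `--window-size=WIDTH,HEIGHT` form). The hub URL is the one from the compose file.

```python
from typing import List

# Hub address as used in the question's docker-compose.yml
HUB_URL = "http://hub:4444/wd/hub"


def chrome_arguments() -> List[str]:
    """Return a trimmed, illustrative subset of the Chrome flags from the question."""
    return [
        "--headless",
        "--no-sandbox",
        "--disable-dev-shm-usage",
        "--window-size=1080,760",
    ]


def download_page(url: str, button_css: str, timeout: float = 10.0) -> str:
    """Open `url` on the remote Chrome node, click one button, return the page HTML.

    `button_css` is a hypothetical parameter: a CSS selector for the button to click.
    """
    # Imported lazily so this module can be loaded where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    for arg in chrome_arguments():
        options.add_argument(arg)

    # The key change from the question: the options are actually handed to Remote.
    driver = webdriver.Remote(command_executor=HUB_URL, options=options)
    try:
        driver.get(url)
        # Wait until the button is present and clickable before clicking it.
        button = WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, button_css))
        )
        button.click()
        return driver.page_source
    finally:
        # Always end the session so the browser slot on the node is released.
        driver.quit()
```

From the Flask route this could be called as `download_page(url, "button[type=submit]")`, where the selector is only a placeholder for whatever button the target page has. Calling `driver.quit()` in a `finally` block is a deliberate choice in this sketch: a session that is never closed keeps occupying the Chrome node, which may be one reason later requests appear to load forever.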