Beautiful Soup ".find" does not work when run from the Windows terminal

Question

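I am trying to automate a program that periodically scrapes some prices from Amazon and other sites. (I started with Amazon.)

The problem is that when I run it from PyCharm, the soup.find method finds its target and returns it correctly, but when I run it from the Windows terminal it returns None.

My code works fine in PyCharm, but I need it to run from the Windows terminal so I can automate it with a .bat file.

I find this a very strange problem and I could not find any documentation about it, so it would be great if any of you could help me solve it!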

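Things I have already tried and ruled out:

Uninstalling and reinstalling bs4
Verifying that all required modules are installed
Verifying that Windows runs the program from the same folder as PyCharm
Note that the problem does not always happen (I built a log report, and it shows that out of 12 pages it fails on 10, 11 or 12, never a fixed number; this only happens when running from the terminal)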
The response in both cases is

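I compared the soup obtained from PyCharm with the one obtained from the Windows terminal, and they are different: in the Windows one I cannot find the text even when searching for it manually.

Finally, here is the code I am using, so you can see what I see:

To reproduce the error, this is one of the links that always fails: https://www.amazon.es/dp/B0BRYY69YD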

import time
import requests
from bs4 import BeautifulSoup
import pandas as pd
import os
from csv import writer
from datetime import date, datetime
from tqdm import tqdm

def r_Amazon(URL):
    headers = {
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'es-ES,es;q=0.8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36'
    }
    response = requests.get(URL, headers=headers)
    soup = BeautifulSoup(response.content, 'lxml')
    # Check whether the item is in stock or only available second-hand
    get_error = 0
    try:
        product_price_stat = soup.find('span', {'class': 'a-text-bold'}).text.strip() #<- HERE IT FAILS

        if product_price_stat == 'Comprar de segunda mano' or product_price_stat == 'Ofertas destacadas no disponibles':
            # The item only has a second-hand price: use the corresponding branch
            try:
                product_price = 'ND'
                get_error = 1
            except:
                print('ERROR 2nd TRY')
                get_error = 1
        else:
            # The item has a normal price: use the corresponding branch
            try:
                product_price = soup.find('span', {'class': 'a-offscreen'}).text.strip()
                # Format Correctly
                product_price = product_price.replace('.', '')
                product_price = product_price.replace(',', '.')
                product_price = product_price.replace('€', '')
            except:
                print('ERROR 1rs TRY')
                get_error = 1
    except:
        product_price = 'ND'
        print('ERROR')
        get_error = 1
    return product_price, get_error

Answer 1

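This is a problem with neither your code nor the terminal. Amazon simply does not let you scrape because it thinks you are a bot (yes, even if you send browser-like headers, Amazon can detect it most of the time).

If you print the soup inside the function, this is what you get when it fails:

Enter the characters you see below
Sorry, we just need to make sure you're not a robot. For best results, please make sure your browser is accepting cookies.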
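
A minimal sketch of how that check could be automated, so the script reports a robot check instead of a bare None. The helper name `looks_like_robot_check` and the marker strings are assumptions for illustration, taken from the message quoted above; the wording Amazon actually serves may differ by locale and over time.

```python
# Hypothetical helper: the marker strings are assumed from the message
# quoted above and may not match what every Amazon locale returns.
ROBOT_CHECK_MARKERS = (
    "enter the characters you see below",
    "make sure you're not a robot",
)


def looks_like_robot_check(html: str) -> bool:
    """Return True if the page text contains any assumed robot-check marker."""
    text = html.lower()
    return any(marker in text for marker in ROBOT_CHECK_MARKERS)


if __name__ == "__main__":
    blocked = "<p>Sorry, we just need to make sure you're not a robot.</p>"
    product = "<span class='a-offscreen'>12,99</span>"
    print(looks_like_robot_check(blocked))
    print(looks_like_robot_check(product))
```

In `r_Amazon`, this could be called on `response.text` before parsing, so that a blocked request is logged as such rather than as a missing price.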

I have run into the same trouble in the past, and I suggest you use Selenium instead of requests to fetch the page content.

Here is how you can do it:

import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import pandas as pd
import os
from csv import writer
from datetime import date, datetime
from tqdm import tqdm

def r_Amazon(URL):
    chrome_options = Options()
    chrome_options.add_argument('--headless')  # run Chrome headless so no browser window opens
    driver = webdriver.Chrome(options=chrome_options)
    driver.get(URL)
    soup = BeautifulSoup(driver.page_source, 'lxml')
    driver.quit()
    # Check whether the item is in stock or only available second-hand
    get_error = 0
    try:
        product_price_stat = soup.find('span', {'class': 'a-text-bold'}).text.strip()  # find() returns None when nothing matches, so .text raises AttributeError

        if product_price_stat == 'Comprar de segunda mano' or product_price_stat == 'Ofertas destacadas no disponibles':
            # The item only has a second-hand price: use the corresponding branch
            try:
                product_price = 'ND'
                get_error = 1
            except:
                print('ERROR 2nd TRY')
                get_error = 1
        else:
            # The item has a normal price: use the corresponding branch
            try:
                product_price = soup.find('span', {'class': 'a-offscreen'}).text.strip()
                # Format Correctly
                product_price = product_price.replace('.', '')
                product_price = product_price.replace(',', '.')
                product_price = product_price.replace('€', '')
            except:
                print('ERROR 1rs TRY')
                get_error = 1
    except:
        product_price = 'ND'
        print('ERROR')
        get_error = 1
    os.system("cls" if os.name == 'nt' else "clear")  # clear the screen before returning the output
    return product_price, get_error

Make sure Selenium is installed:

pip install selenium
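
The formatting step in `r_Amazon` (drop the thousands separator, turn the decimal comma into a point, remove the euro sign) can be isolated and checked without any network access. A minimal sketch, assuming prices arrive in the Spanish format `1.234,56 €`; `parse_eur_price` is a hypothetical name, not part of the original code.

```python
from typing import Optional


def parse_eur_price(raw: Optional[str]) -> Optional[float]:
    """Convert a Spanish-formatted price such as '1.234,56 €' to a float.

    Returns None when the input is missing or not numeric, instead of
    raising, so the caller can record 'ND' explicitly.
    """
    if raw is None:
        return None
    cleaned = (
        raw.replace("\xa0", "")  # a non-breaking space may precede the euro sign
        .replace("€", "")
        .replace(".", "")        # thousands separator
        .replace(",", ".")       # decimal comma -> decimal point
        .strip()
    )
    try:
        return float(cleaned)
    except ValueError:
        return None
```

Because `soup.find(...)` returns None when nothing matches, a caller could pass the element's text only when the element exists and treat a None result as the 'ND' case.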
