eBay scraping: excluding listings under "Results matching fewer words"


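I'm trying to scrape eBay sold listings for products I've pulled from the LEGO site, and then calculate the average sold price of those listings.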

The problem is that the results include products listed under "Results matching fewer words", which are unrelated to what I want to see.

I tried to stop processing once it reaches that selector, but then everything comes through as #N/A.

Can you help me edit this so it works in Python?

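import re
import statistics
import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# CHROME_DRIVER_PATH and chrome_options are module-level settings not shown here.

def fetch_ebay_average_price(ebay_search_url):
    """Fetch the average sold price and count of sold listings from an eBay sold-listings page."""
    driver = None
    try:
        service = Service(CHROME_DRIVER_PATH)
        driver = webdriver.Chrome(service=service, options=chrome_options)
        driver.get(ebay_search_url)
        time.sleep(3)  # crude fixed wait for the page to render

        soup = BeautifulSoup(driver.page_source, 'html.parser')
        # Selects every price on the page, including those under "Results matching fewer words".
        price_elements = soup.select('span.s-item__price')

        prices = []
        for price_element in price_elements:
            try:
                price_text = price_element.text.strip()
                # Take the first token, then drop currency symbols and thousands separators.
                price_value = float(re.sub(r'[£$€,]', '', price_text.split()[0]))
                prices.append(price_value)
            except (ValueError, IndexError) as e:
                print(f"Error processing price: {e}")
                continue

        if prices:
            average_price = round(statistics.mean(prices), 2)
            sold_count = len(prices)
            return f"£{average_price}", sold_count
        return "N/A", 0
    except Exception as e:
        print(f"Error fetching eBay average price: {e}")
        return "N/A", 0
    finally:
        # Always shut the browser down, even when an error occurred.
        if driver is not None:
            driver.quit()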

Here is an example LEGO product that is also sold on eBay: https://www.lego.com/en-gb/product/stable-of-dream-creatures-71459 https://www.ebay.co.uk/sch/i.html?_nkw=乐高%20Stable%20of%C2%A0Dream%C2%A0Creatures%2071459&rt=nc&LH_Sold=1&LH_Complete=1

When I inspect the eBay page, the element I'm looking at is the one with the class "srp-river-answer--REWRITE_START".

python selenium-webdriver web-scraping beautifulsoup
1 answer

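It is better to first select all the items shown in the search results list. You can then loop over those items, read their prices, and stop when you hit the "Results matching fewer words" separator.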

The following worked for me:

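from bs4 import BeautifulSoup
from selenium import webdriver


def fetch_ebay_average_price(url):
    options = webdriver.ChromeOptions()
    options.add_argument('--headless=new')

    # Create the driver before the try block so the finally clause never sees an undefined name.
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)

        soup = BeautifulSoup(driver.page_source, 'html.parser')

        # Every entry in the results list, including the separator row.
        items = soup.select('.srp-list li')
        for item in items:
            # BeautifulSoup returns the class attribute as a list of class names.
            if 'srp-river-answer--REWRITE_START' in item.get('class', []):
                notice = item.select_one('.section-notice__main')
                label = notice.text if notice else 'unknown'
                print(f'Found separator "{label}". Stopping.')
                break
            name = item.select_one('.s-item__title')
            price = item.select_one('.s-item__price')
            if name and price:
                print(f'Found item "{name.text}" with price {price.text}')
    finally:
        # quit() ends the whole browser session, not just the current window.
        driver.quit()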

It prints a series of product names and prices, then stops:

...
Found item "LEGO Dreamzzz: Stable of Dream Creatures (71459) BRAND NEW FACTORY SEALED" with price £28.55
Found item "LEGO DREAMZZZ: Stable of Dream Creatures (71459) 681 Pcs. New In Box" with price £36.16
Found item "LEGO STICKER SHEET for 71459 Stable of Dream Creatures, NEW & Genuine!" with price £6.65
Found separator "Results matching fewer words". Stopping.
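
The loop above only prints the listings, while the question asks for an average price and a count. A minimal sketch of how the two could be combined, assuming the same selectors still match eBay's markup. The names parse_price and average_price_above_separator are made up for this illustration, and returning (None, 0) for an empty result is my own choice, not something from the question:

```python
import re
import statistics

from bs4 import BeautifulSoup

# Class on the "Results matching fewer words" separator row (taken from the question).
SEPARATOR_CLASS = 'srp-river-answer--REWRITE_START'


def parse_price(text):
    """Return the first number in a price string such as '£1,234.50', or None."""
    match = re.search(r'\d[\d,]*(?:\.\d+)?', text)
    if match is None:
        return None
    return float(match.group().replace(',', ''))


def average_price_above_separator(html):
    """Average the prices of the listings that appear before the separator row."""
    soup = BeautifulSoup(html, 'html.parser')
    prices = []
    for item in soup.select('.srp-list li'):
        if SEPARATOR_CLASS in item.get('class', []):
            break  # everything after this row is a looser match
        price = item.select_one('.s-item__price')
        if price is None:
            continue  # not a listing row, or a listing without a price
        value = parse_price(price.get_text(strip=True))
        if value is not None:
            prices.append(value)
    if not prices:
        return None, 0
    return round(statistics.mean(prices), 2), len(prices)
```

With Selenium, this would be called as average_price_above_separator(driver.page_source) after driver.get(url). Taking an HTML string rather than a driver keeps the parsing step testable without a browser. For a price range such as "£10.00 to £20.00", only the first number is used, which matches what the code in the question does.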