How do I scrape data attribute values with BS4?

Question | Votes: 0 | Answers: 1

How can I scrape this data with BS4? I tried html.parser, but it did not work.

My code is:

for page in pages:
            
    page= cat[1] + "?s=%3Arelevance&page=" + str(page)
    page1 = requests.get(page)

    soup = BeautifulSoup(page1.content, "html.parser")
    data = [(re.find_all('div', attrs={'class':'prd'}), page1.text)]

    if not data:
        break

The HTML I want to extract from:

<div id="product-item" class="prd " data-product-id="125034323" data-product-name="Lenovo IdeaPad 5 Intel Core i3-1115G4 4 GB 256 GB SSD Integrated Intel UHD Graphics 14&quot; FHD W11 Platinum Notebook Gri 82FE00LBTX" data-product-category="Bilgisayar ve Tablet" data-product-brand="Lenovo" data-product-price="5799.0" data-product-url="/lenovo-ideapad-5-intel-core-i31115g4-4-gb-256-gb-ssd-integrated-intel-uhd-graphics-14-fhd-w11-platinum-notebook-gri-82fe00lbtx-p-125034323" data-product-page-type="CATEGORY" data-product-position="1" data-product-subcategory="Laptop, Notebook" data-product-actual-price="0.0" data-product-discounted-price="5799.0" data-product-rating-score="" data-product-review-count="" data-product-occasion="N" data-product-photo-count="8" data-product-video="N" data-product-special="N" data-product-stock="Y" data-product-stock-status="Satışta" data-product-review="" data-product-variant="" data-category-name="Bilgisayar ve Tablet" data-facet-name="" data-facet-value="">
python web-scraping beautifulsoup
1 Answer
2 votes

You can adapt the following code to your needs:

import requests
from bs4 import BeautifulSoup

response = requests.get("https://www.teknosa.com/bilgisayar-tablet-c-116")
soup = BeautifulSoup(response.text, "html.parser")

data = []
# Each product card is a <div class="prd"> whose details live in data-* attributes.
for prd in soup.find_all('div', attrs={'class': 'prd'}):  # or soup.select(".prd")
    # Tag attributes are read like dictionary keys.
    product_id = prd['data-product-id']
    photo_count = prd['data-product-photo-count']
    name = prd['data-product-name']
    discounted_price = prd['data-product-discounted-price']
    url = prd['data-product-url']
    data.append([product_id, photo_count, name, discounted_price, url])
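The loop in the question calls find_all on re rather than on soup, and it wraps the result in a one-element list, so "if not data" can never be true. A minimal sketch of how the paginated version might look is given here. The helper names extract_products and scrape_category, the FIELDS mapping, the max_pages cap, the timeout, and the assumption that page numbering starts at 0 are all illustrative and not taken from the original post; the query parameters mirror the "?s=%3Arelevance&page=" string in the question.

```python
import requests
from bs4 import BeautifulSoup

# Attribute names copied from the sample <div> in the question.
# The short keys on the left are hypothetical labels chosen for this sketch.
FIELDS = {
    "id": "data-product-id",
    "name": "data-product-name",
    "price": "data-product-discounted-price",
    "url": "data-product-url",
}


def extract_products(html):
    """Return one dict per <div class="prd"> found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # .get() yields None instead of raising KeyError if an attribute is absent.
    return [
        {key: div.get(attr) for key, attr in FIELDS.items()}
        for div in soup.find_all("div", class_="prd")
    ]


def scrape_category(base_url, max_pages=50):
    """Request successive listing pages until one contains no products.

    Assumes the first page is page=0; max_pages is an arbitrary safety cap.
    """
    products = []
    for page in range(max_pages):
        response = requests.get(
            base_url,
            params={"s": ":relevance", "page": page},
            timeout=10,
        )
        response.raise_for_status()
        found = extract_products(response.text)
        if not found:
            break  # an empty page is taken to mean the listing has ended
        products.extend(found)
    return products
```

Keeping the parsing in extract_products, separate from the network calls, means it can be checked against a saved HTML snippet without contacting the site. If extract_products returns an empty list for a page that shows products in the browser, the listing may be rendered by JavaScript, in which case html.parser will not see it regardless of the selector used.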