Scraping Amazon product names

Question  Votes: 1  Answers: 1

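I am trying to collect the product names from the first two pages of Amazon results for a given seller. When I request the page, the response contains all the elements I need, but when I parse it with BeautifulSoup they are not listed. Here is my code: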

import requests
from bs4 import BeautifulSoup
headers = {'User-Agent':'Mozilla/5.0'}
res = requests.get("https://www.amazon.com/s?me=A3WE363L17WQR&marketplaceID=ATVPDKIKX0DER", headers=headers)
#print(res.text)
soup = BeautifulSoup(res.text, "html.parser")
soup.find_all("a",href=True)

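The product links are not listed. If the Amazon API provides this information, I am open to using it (please provide some usage examples). Thanks a lot in advance.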

python web-scraping beautifulsoup
1 Answer
0
votes

I extracted the product names from the alt attribute. Is this what you intended?

import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://www.amazon.com/s?me=A3WE363L17WQR&marketplaceID=ATVPDKIKX0DER')
soup = bs(r.content, 'lxml')
# Take the alt text of every element that has an alt attribute and sits inside an .a-link-normal element
items = [item['alt'] for item in soup.select('.a-link-normal [alt]')]
print(items)
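To see what the `.a-link-normal [alt]` selector matches without hitting the network, here is a minimal sketch that runs it against a small hand-written HTML string. The markup in `SAMPLE_HTML` and the helper name `extract_alt_names` are made up for illustration. The real Amazon page structure may differ and can change at any time.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modeled on a search-result tile.
# This is NOT copied from Amazon; it only illustrates the selector.
SAMPLE_HTML = """
<div>
  <a class="a-link-normal" href="/dp/X1"><img src="a.jpg" alt="Widget A"></a>
  <a class="a-link-normal" href="/dp/X2"><img src="b.jpg" alt="Widget B"></a>
  <a class="other" href="/help"><img src="c.jpg" alt="Help icon"></a>
  <a class="a-link-normal" href="/dp/X3"><span>No image here</span></a>
</div>
"""


def extract_alt_names(html):
    """Return the alt text of every element nested inside an .a-link-normal element."""
    # The built-in "html.parser" is used so the sketch does not require lxml.
    soup = BeautifulSoup(html, "html.parser")
    # '.a-link-normal [alt]' is a descendant selector: it matches any element
    # that carries an alt attribute and sits inside an element with that class.
    return [tag["alt"] for tag in soup.select(".a-link-normal [alt]")]


names = extract_alt_names(SAMPLE_HTML)
print(names)
```

The image inside the `other` link is skipped because its parent lacks the class, and the third `a-link-normal` link contributes nothing because it contains no element with an alt attribute.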

For the first two pages:

import requests
from bs4 import BeautifulSoup as bs

# The page number is substituted into both the page and ref parameters
url = 'https://www.amazon.com/s?i=merchant-items&me=A3WE363L17WQR&page={}&marketplaceID=ATVPDKIKX0DER&qid=1553116056&ref=sr_pg_{}'
for page in range(1, 3):  # pages 1 and 2
    r = requests.get(url.format(page, page))
    soup = bs(r.content, 'lxml')
    items = [item['alt'] for item in soup.select('.a-link-normal [alt]')]
    print(items)
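As an alternative to formatting the URL by hand, the per-page URLs could be assembled with `urllib.parse.urlencode`. This is only a sketch: `build_page_url` is a hypothetical helper, and leaving out the `qid` and `ref` parameters is an untested assumption about what the site accepts.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.amazon.com/s"


def build_page_url(seller_id, page, marketplace_id="ATVPDKIKX0DER"):
    """Build the seller listing URL for one results page (hypothetical helper)."""
    # Assumption: qid and ref from the original URL are omitted here;
    # whether the site still serves the page without them is not verified.
    params = {
        "i": "merchant-items",
        "me": seller_id,
        "page": page,
        "marketplaceID": marketplace_id,
    }
    # urlencode keeps the dict's insertion order and escapes values as needed.
    return f"{BASE_URL}?{urlencode(params)}"


# range(1, 3) yields pages 1 and 2, matching the loop above.
urls = [build_page_url("A3WE363L17WQR", page) for page in range(1, 3)]
for u in urls:
    print(u)
```

Each URL can then be passed to `requests.get` in place of `url.format(page, page)`.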