这是我到目前为止的代码
for page in range(1, 5):
guitarPage
=requests.get('https://www.guitarguitar.co.uk/guitars/electric/page-'.format(page)).text
soup = BeautifulSoup(guitarPage, 'lxml')
# row = soup.find(class_='row products flex-row')
guitars = soup.find_all(class_='col-xs-6 col-sm-4 col-md-4 col-lg-3')
这是迭代产品的实际循环
for guitar in guitars:
title_text = guitar.h3.text.strip()
print('Guitar Name: ', title_text)
price = guitar.find(class_='price bold small').text.strip()
print('Guitar Price: ', price)
time.sleep(0.5)
到目前为止,代码只能在同一页面上运行,而无需转到下一页。网站的URL结构适用于第2页,第3页++等。
您必须在链接中添加{}。我还添加了时间模块。
import requests
from bs4 import BeautifulSoup
import time
for page in range(1, 5):
guitarPage = requests.get('https://www.guitarguitar.co.uk/guitars/electric/page-{}'.format(page)).text
soup = BeautifulSoup(guitarPage, 'lxml')
# row = soup.find(class_='row products flex-row')
guitars = soup.find_all(class_='col-xs-6 col-sm-4 col-md-4 col-lg-3')
for guitar in guitars:
title_text = guitar.h3.text.strip()
price = guitar.find(class_='price bold small').text.strip()
print('Guitar Name: ', title_text, 'Guitar Price: ', price)
time.sleep(0.5)