Trying to scrape the next web page

Question · votes: 0 · answers: 1

Here is my code so far:

for page in range(1, 5):
    guitarPage = requests.get('https://www.guitarguitar.co.uk/guitars/electric/page-'.format(page)).text
    soup = BeautifulSoup(guitarPage, 'lxml')
    # row = soup.find(class_='row products flex-row')
    guitars = soup.find_all(class_='col-xs-6 col-sm-4 col-md-4 col-lg-3')

And this is the actual loop that iterates over the products:

    for guitar in guitars:
        title_text = guitar.h3.text.strip()
        print('Guitar Name: ', title_text)
        price = guitar.find(class_='price bold small').text.strip()
        print('Guitar Price: ', price)
        time.sleep(0.5)

So far, the code only ever runs against the same page and never moves on to the next one. The site's URL structure does work for page 2, page 3, and so on.

loops for-loop web-scraping beautifulsoup iterator
1 Answer
0 votes

You have to add the `{}` placeholder to the link, otherwise `.format(page)` has nothing to substitute into and every request fetches the same URL. I have also added the missing imports, including the time module.

    import requests
    from bs4 import BeautifulSoup
    import time

    for page in range(1, 5):
        guitarPage = requests.get('https://www.guitarguitar.co.uk/guitars/electric/page-{}'.format(page)).text
        soup = BeautifulSoup(guitarPage, 'lxml')
        # row = soup.find(class_='row products flex-row')
        guitars = soup.find_all(class_='col-xs-6 col-sm-4 col-md-4 col-lg-3')
        for guitar in guitars:
            title_text = guitar.h3.text.strip()
            price = guitar.find(class_='price bold small').text.strip()
            print('Guitar Name: ', title_text, 'Guitar Price: ', price)
            time.sleep(0.5)
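As an aside, on Python 3.6+ an f-string builds the same paginated URL and makes the placeholder harder to forget. A quick sketch checking that both forms produce identical URLs (no network access needed):

```python
# str.format with an explicit {} placeholder, as in the answer above
base = 'https://www.guitarguitar.co.uk/guitars/electric/page-{}'

for page in range(1, 5):
    url_format = base.format(page)
    # f-string equivalent: the variable sits directly inside the braces
    url_fstring = f'https://www.guitarguitar.co.uk/guitars/electric/page-{page}'
    assert url_format == url_fstring
    print(url_fstring)
```

Without the `{}` in the template, `.format(page)` silently returns the string unchanged, which is exactly the bug in the original question.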