Scraping multiple tables from a web page with Python

Problem description

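I'm trying to scrape multiple tables from the web page below. However, my code only gets the first table, even though all the tables are nested in the same tr and td tags. Here is my attempt: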

import requests
from bs4 import BeautifulSoup
from time import sleep

url = "http://zipnet.in/index.php?page=missing_person_search&criteria=browse_all&Page_No=1"
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
tables = soup.find('table', border=1)
for row in tables.findAll('tr'):
    sleep(3)
    col = row.findAll('td')
    fields = col[0].string
    details = col[1].string
    record = (fields, details)
    print(record)

What am I missing here?

python web-scraping beautifulsoup python-requests
1 answer

Give this a try. It gets all the tables on that page, in particular the ones containing the records you want:

import requests
from bs4 import BeautifulSoup

url = "http://zipnet.in/index.php?page=missing_person_search&criteria=browse_all&Page_No=1"
res = requests.get(url)
soup = BeautifulSoup(res.text, 'lxml')  # needs the lxml package installed
# select() returns every matching table, not just the first one
for trow in soup.select("table#AutoNumber15"):
    # One list per row; whitespace inside each cell is collapsed
    data = [[' '.join(item.text.split()) for item in tcel.select("td")]
            for tcel in trow.select("tr")]
    print(data)
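
As for what was missing in the original attempt: `find` returns only the first matching element, while `find_all` (like `select`) returns every match. Here is a minimal offline sketch of the difference. The inline HTML, the `border="1"` filter and the helper name `rows_from_tables` are made up for illustration and are not taken from the actual page, so treat it as a starting point rather than a drop-in fix.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: two sibling tables
html = """
<table border="1"><tr><td>Name</td><td>Alice</td></tr></table>
<table border="1"><tr><td>Name</td><td>Bob</td></tr></table>
"""

soup = BeautifulSoup(html, "html.parser")


def rows_from_tables(soup):
    """Collect (field, detail) pairs from every matching table."""
    records = []
    # find_all returns all matching tables; find would stop at the first
    for table in soup.find_all("table", border="1"):
        for row in table.find_all("tr"):
            cells = [" ".join(td.get_text().split())
                     for td in row.find_all("td")]
            # Skip rows that do not have both a field and a detail cell
            if len(cells) >= 2:
                records.append((cells[0], cells[1]))
    return records


records = rows_from_tables(soup)
print(records)
```

The `len(cells) >= 2` check is there because indexing `col[1]` on a row with a single cell would raise an `IndexError`.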