ValueError: cannot set a row with mismatched columns -- BeautifulSoup

Question (votes: 0, answers: 1)

When scraping a table from Wikipedia, I get the error "ValueError: cannot set a row with mismatched columns" (see below). How can I fix this?

from bs4 import BeautifulSoup
import pandas as pd
import requests
url = 'https://en.wikipedia.org/wiki/List_of_largest_companies_by_revenue'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html')
table = soup.find_all('table')[0]
soup.find('tr')
world_companies = soup.find('tr')
df = pd.DataFrame(columns = world_table_companies)
df
table.find_all('tr')
column_data = table.find_all('tr')
for row in column_data[2:]:
    row_data = row.find_all('td')
    individual_row_data = [data.text.strip() for data in row_data]
    length = len(df)
    df.loc[length] = individual_row_data

ValueError: cannot set a row with mismatched columns
python beautifulsoup jupyter-notebook
1 answer
0 votes

You don't need BeautifulSoup to scrape a table when you have pandas:

import pandas as pd

table_MN = pd.read_html('https://en.wikipedia.org/wiki/List_of_largest_companies_by_revenue')
for df in table_MN:
    if "Rank" in df.columns:
        print(df.to_string(index=False))
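
As for why the loop in the question fails: `df.loc[len(df)] = values` raises this ValueError whenever the list has a different number of items than the DataFrame has columns. That can happen, for instance, when a row contains `<th>` cells that `find_all('td')` does not return. (The posted snippet also never defines `world_table_companies`, so the column list itself may not be what was intended.) Below is a minimal sketch of checking cell counts before building the DataFrame. It uses made-up rows rather than the live Wikipedia table, and `align_rows` plus the sample data are hypothetical names for illustration, not part of the question or of pandas:

```python
def align_rows(headers, rows):
    """Split rows into those whose cell count matches the headers and the rest."""
    kept, skipped = [], []
    for row in rows:
        if len(row) == len(headers):
            kept.append(row)
        else:
            skipped.append(row)
    return kept, skipped


# Hypothetical sample data standing in for the scraped table.
headers = ["Rank", "Name", "Industry"]
rows = [
    [],                          # e.g. a header row with only <th> cells
    ["1", "Walmart", "Retail"],  # matches the three headers
    ["2", "Amazon"],             # one cell short, would trigger the ValueError
]

kept, skipped = align_rows(headers, rows)
print(f"kept {len(kept)} row(s), skipped {len(skipped)} row(s)")

# With pandas installed, the aligned rows can be loaded in one step:
#   df = pd.DataFrame(kept, columns=headers)
```

Inspecting `skipped` shows which rows have an unexpected number of cells, which is usually a quicker way to find the mismatch than appending rows one at a time.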