Why am I getting "IndexError: list index out of range"? (Beautiful Soup)

python web-scraping beautifulsoup
2 answers
1 vote

I changed the code below so that it takes the last "center" td in each row:

import urllib
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(urllib.urlopen("http://www.law.com/special/professionals/amlaw/amlaw200/amlaw200_ppp.html"))
rows = soup.findAll(name='tr',attrs={'valign':'bottom'}, limit=13)
for row in rows:
    tds_left = row.findAll(name='td',attrs={'align':'left'}, limit=13)
    tds_center = row.findAll(name='td',attrs={'align':'center'}, limit=13)
    if tds_left:
        firm_name = tds_left[0].text
        print firm_name
    if tds_center:
        # get last td "center"
        ppp = tds_center[-1].text
        print ppp

and got the following output:

Firm
Profits PerPartner
Wachtell, Lipton
$3,385,000
Robins, Kaplan
$3,055,000
Cravath
$2,110,000
Sullivan & Cromwell
$1,790,000
Cahill Gordon
$1,710,000
Simpson Thacher
$1,655,000
Davis Polk
$1,610,000
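
As a rough illustration of why [-1] combined with the "if tds_center:" check avoids the IndexError, here is a minimal Python 3 sketch. It uses plain lists of strings as stand-ins for the td lists that findAll returns. The sample rows and the helper name last_cell_text are made up for this example and are not part of the code above.

```python
def last_cell_text(cells):
    """Return the last item of cells, or None when the list is empty."""
    if cells:
        return cells[-1]
    return None


# Hypothetical stand-ins for the per-row results of
# row.findAll(name='td', attrs={'align': 'center'}).
rows = [
    [],                            # a row with no centered cells
    ["Profits PerPartner"],        # header row: one centered cell
    ["x", "y", "$3,385,000"],      # data row: three centered cells
]

results = [last_cell_text(cells) for cells in rows]
for value in results:
    print(value)
```

Because [-1] always refers to the last element, it works for any non-empty list regardless of how many cells the row has, and the emptiness check covers the remaining case.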

0 votes

The traceback does not match the code you posted.

Traceback:

ppp = tds_center[2].text

Your code:

ppp = tds_center[0].text

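The code as posted runs and produces valid output, but the output is not very interesting. John Keyes's answer, which uses index [-1], produces more interesting output. Which one to use depends on what you need.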
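
To make the difference between the two indexes concrete, here is a small Python 3 sketch that tries index 2 (from the traceback), index 0 (from the posted code), and index -1 on lists of different lengths. The lists and the helper try_index are hypothetical and only stand in for the tds_center list of a given row.

```python
def try_index(cells, index):
    """Return cells[index], or the string 'IndexError' if that position does not exist."""
    try:
        return cells[index]
    except IndexError:
        return "IndexError"


# Hypothetical stand-ins for tds_center in two different rows.
short_row = ["Profits PerPartner"]       # only one centered cell
full_row = ["x", "y", "$3,385,000"]      # three centered cells

for name, cells in (("short_row", short_row), ("full_row", full_row)):
    for index in (2, 0, -1):
        print(name, index, try_index(cells, index))
```

In this sketch only index 2 on the one-element list fails, which is consistent with a traceback that points at tds_center[2]. Note that [0] and [-1] both still raise IndexError on an empty list, so a check such as "if tds_center:" is needed in either case.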
