I changed how the last "center" td is fetched in the following code:
import urllib
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(urllib.urlopen("http://www.law.com/special/professionals/amlaw/amlaw200/amlaw200_ppp.html"))
rows = soup.findAll(name='tr',attrs={'valign':'bottom'}, limit=13)
for row in rows:
    tds_left = row.findAll(name='td',attrs={'align':'left'}, limit=13)
    tds_center = row.findAll(name='td',attrs={'align':'center'}, limit=13)
    if tds_left:
        firm_name = tds_left[0].text
        print firm_name
    if tds_center:
        # get last td "center"
        ppp = tds_center[-1].text
        print ppp
and got the following result:
Firm
Profits PerPartner
Wachtell, Lipton
$3,385,000
Robins, Kaplan
$3,055,000
Cravath
$2,110,000
Sullivan & Cromwell
$1,790,000
Cahill Gordon
$1,710,000
Simpson Thacher
$1,655,000
Davis Polk
$1,610,000
The traceback doesn't match your code.
Traceback:
ppp = tds_center[2].text
Your code:
ppp = tds_center[0].text
The output of your code is valid but not very interesting. John Keyes has more interesting output, but it uses the [-1] index. It depends on what you need.
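To make the difference between [0], [2] and [-1] concrete, here is a minimal sketch in plain Python. It assumes that rows can hold different numbers of centered cells. The lists stand in for what row.findAll(...) returns, the cell values are made up, and last_cell and cell_at are hypothetical helper names, not part of BeautifulSoup.

```python
# Plain lists stand in for the result of row.findAll(...).
# The cell values are invented for illustration only.
rows = [
    ["1", "2", "$3,385,000"],  # a row with three centered cells
    ["$3,055,000"],            # a row with a single centered cell
    [],                        # a row with no centered cells
]


def last_cell(cells):
    """Return the last cell, or None when the row has no cells."""
    return cells[-1] if cells else None


def cell_at(cells, index):
    """Return cells[index], or None when the index is out of range."""
    try:
        return cells[index]
    except IndexError:
        return None


for cells in rows:
    # Columns printed: index 0, index 2, last element
    print(cell_at(cells, 0), cell_at(cells, 2), last_cell(cells))
```

A fixed index such as [2] only works when every row has at least three matching cells. [-1] always picks the last one, so it only needs the emptiness check that the `if tds_center:` guard already provides.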