findNextSiblings() function not working


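I tried the code below, but it does not work and gives me this error:

"AttributeError: 'NoneType' object has no attribute 'findNextSiblings'"

What should I do to fix this error?

I tried removing the h_span and w_span variables and calling soup.findNextSiblings in the loop instead of h_span.findNextSiblings. The code runs that way, but it only returns an empty string.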

from selenium import webdriver
from bs4 import BeautifulSoup
import requests
import os

driver = webdriver.Chrome(executable_path= r'E:/Summer/FirstThings/Web scraping (bucky + pdf)/webscraping/tutorials-master/chromedriver.exe')
url = 'https://www.nba.com/players/aron/baynes/203382'
driver.get(url)

soup = BeautifulSoup(driver.page_source , 'lxml')

height = ''
h_span = soup.find('p', string = 'HEIGHT')
for span in h_span.findNextSiblings():
    height = height + span.text

weight = ''
w_span = soup.find('p', string = 'WEIGHT')
for span in w_span.findNextSiblings():
    weight = weight + span.text

born = ''
b_span = soup.find('p', string = 'BORN')
for span in b_span.findNextSiblings():
    born = born + span.text


print(height)
print("")
print(weight)
print("")
print(born)


# Close the browser; calling __exit__() directly without arguments raises a TypeError
driver.quit()

It should return the player's height and weight as text, the way they appear under those headings on the page.
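For reference, the error message means that one of the `soup.find('p', string=...)` calls returned `None`, so there was no tag to call `findNextSiblings()` on. Calling it on `soup` itself runs without error but yields nothing, because the document root has no siblings. The following is a minimal sketch of guarding against a missing label. The inline HTML and the helper name `sibling_text` are assumptions made up for illustration; the real nba.com markup is not shown in the question and may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, NOT the real nba.com page structure.
html = (
    "<div><p>HEIGHT</p>"
    "<span>6</span><span>ft </span><span>10</span><span>in</span></div>"
)
soup = BeautifulSoup(html, "html.parser")


def sibling_text(soup, label):
    """Join the text of all tags following the <p> whose string equals label.

    Returns None when no such <p> exists instead of raising AttributeError.
    """
    tag = soup.find("p", string=label)
    if tag is None:
        return None
    return "".join(sib.get_text() for sib in tag.find_next_siblings())


print(sibling_text(soup, "HEIGHT"))
print(sibling_text(soup, "WEIGHT"))
```

With this sample, the first call finds the label and joins its sibling text, while the second returns `None` because no `<p>` with the string `WEIGHT` exists. If every lookup returns `None` on the live page, printing part of `driver.page_source` is one way to check whether the labels are actually present in the HTML that was parsed.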

python web-scraping pycharm
1 Answer

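I love working with sports data!

You are doing too much work here. There is no need to use Selenium or BeautifulSoup to parse the HTML, because nba.com provides this data in a nice JSON format. All you need to do is find the player you need and extract the data you want: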

import requests

url = 'https://data.nba.net/prod/v1/2018/players.json'

# Browser-like User-Agent sent along with the request
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}

# Download the player list and decode the JSON body
jsonData = requests.get(url, headers=headers).json()

find_player = 'Baynes'

# Each entry in the list is one player; match on last name
for player in jsonData['league']['standard']:
    if player['lastName'] == find_player:
        name = player['firstName'] + ' ' + player['lastName']
        height = player['heightFeet'] + 'ft ' + player['heightInches'] + 'in'
        weight = player['weightPounds'] + 'lbs'
        born = player['dateOfBirthUTC']

        print('%s\n%s\n%s\n%s\n' % (name, height, weight, born))

Output:

Aron Baynes
6ft 10in
260lbs
1986-12-09
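
The lookup can also be wrapped in a small function that reports when no player matches. The sketch below runs offline on an inline sample, assuming the JSON has the `league` / `standard` shape and the field names used above. The function name `describe_players` and the one-record `sample` are hypothetical, with values copied from the output shown above.

```python
def describe_players(data, last_name):
    """Return one formatted summary per player whose lastName matches."""
    summaries = []
    for player in data["league"]["standard"]:
        if player["lastName"] == last_name:
            summaries.append(
                "{firstName} {lastName}\n"
                "{heightFeet}ft {heightInches}in\n"
                "{weightPounds}lbs\n"
                "{dateOfBirthUTC}".format(**player)
            )
    return summaries


# Hypothetical one-record sample standing in for the downloaded JSON.
sample = {
    "league": {
        "standard": [
            {
                "firstName": "Aron",
                "lastName": "Baynes",
                "heightFeet": "6",
                "heightInches": "10",
                "weightPounds": "260",
                "dateOfBirthUTC": "1986-12-09",
            }
        ]
    }
}

matches = describe_players(sample, "Baynes")
print("\n\n".join(matches) if matches else "player not found")
```

Returning a list rather than printing inside the loop keeps the case of several players sharing a last name visible to the caller, and an empty list makes a missing player easy to detect.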