How do I scrape specific words from a web page in Python?

Question   Votes: -1   Answers: 1

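I want to scrape specific words from a web page: whenever the element with id='org' is 'England', I want to get the id='name' text together with 'England'. My code is: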

from bs4 import BeautifulSoup
import requests

r = requests.get('https://however.org/plus')
soup = BeautifulSoup(r.text, 'html.parser')
res = soup.find_all(id={'name', 'org'})

for item in res:
    print(item.text.strip())

So my output looks like this:

General English: Intermediate <====== (name)
american   <======= (org)
General English: Elementary
England    <=======
General English: Intermediate Plus
England
General English: Beginner
american
TOEFL iBT: Listening and Speaking
England
TOEFL iBT: Reading
american
Grammar for IELTS
american

But I only want the names whose org is England, not the american orgs or their names. I want this output:

General English: Elementary
England 
General English: Intermediate Plus
England
TOEFL iBT: Listening and Speaking
England

How can I solve this? I want to print the name together with its org only when the org is 'England'.

python web-scraping
1 Answer
0
votes

If the output is all you care about, this change to the for loop is enough:

# Assumes res alternates name, org, name, org, ... as in the output above.
for name, org in zip(res[::2], res[1::2]):
    # Print the pair only when the org text is 'England' (case-insensitive).
    if org.text.strip().lower() == 'england':
        print(name.text.strip())
        print(org.text.strip())
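
The filtering step can also be tried on its own, without fetching the page. The following is a minimal sketch, assuming the matched elements alternate name, org, name, org as in the question's output. `filter_pairs` and `sample` are hypothetical names introduced for illustration, and the sample data is copied from the output shown in the question.

```python
def filter_pairs(texts, wanted_org):
    """Keep (name, org) pairs from a flat [name, org, name, org, ...] list
    whose org equals wanted_org, ignoring case and surrounding whitespace."""
    names = texts[0::2]   # even positions: names (assumed ordering)
    orgs = texts[1::2]    # odd positions: orgs (assumed ordering)
    wanted = wanted_org.strip().lower()
    return [
        (name.strip(), org.strip())
        for name, org in zip(names, orgs)
        if org.strip().lower() == wanted
    ]


# Sample data copied from the output shown in the question.
sample = [
    "General English: Intermediate", "american",
    "General English: Elementary", "England",
    "General English: Intermediate Plus", "England",
    "General English: Beginner", "american",
    "TOEFL iBT: Listening and Speaking", "England",
    "TOEFL iBT: Reading", "american",
    "Grammar for IELTS", "american",
]

for name, org in filter_pairs(sample, "England"):
    print(name)
    print(org)
```

With the `res` list from the question, the input would be `[item.text.strip() for item in res]`. If the page does not strictly alternate name and org elements, the pairing by position would be wrong and the elements would need to be matched through their shared parent instead.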