How to extract the content of the rel attribute from an <a> tag?

<a href="#" class="tip" rel="&nbsp;
    Principal Name - S. BALKAR SINGH
    Mobile No. - 8146611008
    Email ID - [email protected]
    &nbsp;" style="user-select: text;">View Contact Details<span 
class="caret"></span></a>

The Principal Name, Mobile No. and Email ID are what I am interested in. When I use soup.find('a', {'class':'tip'}), it only gives me "View Contact Details".

Is there a way to extract the content of rel?

python web-scraping beautifulsoup rel
1 Answer

rel is an attribute, not text, so you have to read it with ['rel'], i.e. soup.find('a', {'class':'tip'})['rel']

Working example

data = '''<a href="#" class="tip" rel="&nbsp;
    Principal Name - S. BALKAR SINGH
    Mobile No. - 8146611008
    Email ID - [email protected]
    &nbsp;" style="user-select: text;">View Contact Details<span 
class="caret"></span></a>'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(data, 'html.parser')

item = soup.find('a', {'class':'tip'})

print('text:', item.text)
print(' rel:', item['rel'])
print(' rel:', ' '.join(item['rel']))

Result:

text: View Contact Details
 rel: ['', 'Principal', 'Name', '-', 'S.', 'BALKAR', 'SINGH', 'Mobile', 'No.', '-', '8146611008', 'Email', 'ID', '-', '[email protected]', '']
 rel:  Principal Name - S. BALKAR SINGH Mobile No. - 8146611008 Email ID - [email protected] 

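BeautifulSoup returns rel as a list rather than a string because it treats rel as a multi-valued attribute (see "Multi-valued attributes" in the BeautifulSoup documentation).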
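Once the list has been joined back into one string, the three fields still have to be separated. A minimal sketch of that step using only the standard library follows. It assumes the labels always appear in the order shown in the question; the names PATTERN and parse_contact are made up for illustration, and the e-mail value is kept as the obfuscated placeholder that appears in the source.

```python
import re

# Sample string copied from the joined output above. The e-mail address is
# obfuscated in the source page, so the placeholder is kept as-is.
joined = ' Principal Name - S. BALKAR SINGH Mobile No. - 8146611008 Email ID - [email protected] '

# Assumption: the three labels always appear, in this order.
PATTERN = re.compile(
    r'Principal Name\s*-\s*(?P<name>.*?)\s*'
    r'Mobile No\.\s*-\s*(?P<mobile>.*?)\s*'
    r'Email ID\s*-\s*(?P<email>.*?)\s*$'
)

def parse_contact(text):
    """Return a dict with name, mobile and email, or None if the layout differs."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None

contact = parse_contact(joined)
print(contact)
```

If you would rather get the raw attribute string instead of a list, the BeautifulSoup documentation describes a multi_valued_attributes=None constructor argument that turns the splitting off. Check that your installed version supports it.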


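EDIT: To get the table with the data you have to send a POST request containing all the data that the browser normally sends to the server, i.e. the form data. The values can even be empty strings, but the server has to receive the form fields.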

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0'}

# form fields sent to the server (the values may be empty, but the keys must be present)
params = {
    'SchoolType': '',
    'Dist1': '',    
    'Sch1': '', 
    'SearchString': ''  
}

r = requests.post('http://www.registration.pseb.ac.in/School/Schoollist', headers=headers, data=params)

soup = BeautifulSoup(r.text, 'html.parser')

all_a = soup.find_all('a', {'class':'tip'})

# loop variable must match the name used in the body
for item in all_a:
    print('text:', item.text)
    print(' rel:', item['rel'])
    print(' rel:', ' '.join(item['rel']))
    print('-----')

Result:

text: View Contact Details
 rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', '[email protected]', '']
 rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - [email protected] 
-----
... (one such block is printed for every link found)
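As a side note on the form fields in the POST request above: the sketch below shows what a form body with empty values looks like once it is URL-encoded, which is why the server still "receives" every field. It uses only urllib.parse from the standard library. As far as I know, requests encodes a dict passed as data= in much the same way, but treat that as an assumption rather than a statement about its internals.

```python
from urllib.parse import urlencode, parse_qs

# Same form fields as in the request above, all left empty.
params = {
    'SchoolType': '',
    'Dist1': '',
    'Sch1': '',
    'SearchString': '',
}

# Build the application/x-www-form-urlencoded body.
# Every key is present even though every value is empty.
body = urlencode(params)
print(body)  # SchoolType=&Dist1=&Sch1=&SearchString=

# Parse it back roughly the way a server might.
# keep_blank_values=True is needed, otherwise empty fields are dropped.
received = parse_qs(body, keep_blank_values=True)
print(sorted(received))
```

If the server rejects the request, comparing this body with the one the browser sends (visible in the network tab of the developer tools) is a quick way to spot a missing field.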