I am trying to extract the text from a div tag.
My code:
import requests
from bs4 import BeautifulSoup

url = 'url'
# The parser name belongs to BeautifulSoup, not to requests.get
page = requests.get(url)
soup = BeautifulSoup(page.content, 'lxml')
print(soup.find('div', {'class': 'meta-item salary'}).text)
The HTML:
<div class="meta-item salary">
<span
class="icon icon-pound-currency-3"></span> $1000 - $2000 per annum + + excellent benefits </div>
As a result I get only "excellent package", without the numeric values. I have no idea why.
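You can read the text node that follows the icon span (the span itself is empty, so its own .text would be an empty string):
soup.select_one('.icon-pound-currency-3').next_sibling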
Or the following, which matches on a single class from the compound class attribute:
from bs4 import BeautifulSoup

html = '''
<div class="meta-item salary">
<span
class="icon icon-pound-currency-3"></span> $1000 - $2000 per annum + + excellent benefits </div>
'''
soup = BeautifulSoup(html, 'lxml')
print(soup.find('div',{'class':'salary'}).text)
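To make the difference between the lookups concrete, here is a minimal sketch that runs several of them against the sample HTML. It assumes the built-in 'html.parser' instead of lxml, only to avoid the extra dependency, and the variable names are illustrative. It also uses get_text(strip=True) to drop the surrounding whitespace, which the original snippet does not do.

```python
from bs4 import BeautifulSoup

html = '''
<div class="meta-item salary">
<span class="icon icon-pound-currency-3"></span> $1000 - $2000 per annum + + excellent benefits </div>
'''

# Assumption: 'html.parser' is used here so the sketch needs no lxml install
soup = BeautifulSoup(html, 'html.parser')

# A single class matches any one of the classes on the tag
by_single_class = soup.find('div', {'class': 'salary'})

# The full string only matches when it equals the class attribute exactly
by_exact_string = soup.find('div', {'class': 'meta-item salary'})

# The same classes in a different order do not match as a string
by_reordered = soup.find('div', {'class': 'salary meta-item'})

# A CSS selector matches both classes regardless of their order
by_css = soup.select_one('div.salary.meta-item')

# strip=True removes the leading and trailing whitespace around the text
salary_text = by_single_class.get_text(strip=True)
print(salary_text)
print(by_reordered)
```

If the class order on the real page is not guaranteed, the single-class lookup or the CSS selector is probably the safer choice.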
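If there can be multiple matches, you need find_all (also available as findAll) or soup.select, and then iterate over the returned list. Positional matching may also be possible, but that would require seeing the page HTML.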
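As a rough sketch of the multiple-match case: the HTML below is a made-up page with two salary blocks, since the real page is not shown, and 'html.parser' is again assumed in place of lxml. It iterates over the list returned by soup.select, does the same with find_all, and picks one entry by position.

```python
from bs4 import BeautifulSoup

# Hypothetical page with two salary blocks (not the asker's real page)
html = '''
<div class="meta-item salary"><span class="icon"></span> $1000 - $2000 per annum </div>
<div class="meta-item salary"><span class="icon"></span> $3000 - $4000 per annum </div>
'''

soup = BeautifulSoup(html, 'html.parser')

# select returns a list of every matching tag; iterate to read each one
salaries = [div.get_text(strip=True) for div in soup.select('div.salary')]
for salary in salaries:
    print(salary)

# find_all gives the same tags through the non-CSS API
same_salaries = [div.get_text(strip=True)
                 for div in soup.find_all('div', class_='salary')]

# Positional matching: take the second match by index
second_salary = soup.select('div.salary')[1].get_text(strip=True)
print(second_salary)
```

Indexing by position only works while the page layout stays the same, so it is worth checking the length of the list before relying on a fixed index.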