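I'm new to Beautiful Soup and have been stuck trying to get two values out of "data" in the code below. Ideally, I'd like to select the first value (500) as "item1" and the second value (442) as "item2".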
<div id="chart-1" class="charts-highchart" data-chart="{"chart":{"type":"pie","width":null,"height":null,"backgroundColor"["Male","Female"],"data":[500,442]}],"exporting"pane":null}"
style=""></div>
Use the regular expression module re together with the following CSS selector.
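import re
from bs4 import BeautifulSoup

# The attribute is single-quoted here so the double quotes inside it survive parsing.
# In a real page they would typically be escaped as &quot;, which BeautifulSoup decodes to the same string.
html='''<div id="chart-1" class="charts-highchart" data-chart='{"chart":{"type":"pie","width":null,"height":null,"backgroundColor"["Male","Female"],"data":[500,442]}],"exporting"pane":null}'
style=""></div>'''
soup=BeautifulSoup(html,'html.parser')
# Select the element with id "chart-1" that has a data-chart attribute, then read that attribute.
data=soup.select_one('#chart-1[data-chart]')['data-chart']
# Collect every run of digits in the attribute value (as strings).
items=re.findall(r"(\d+)",data)
for item in items:
    print(item)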
Output:
500
442
If you want the values in variables, use this.
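import re
from bs4 import BeautifulSoup

# The attribute is single-quoted here so the double quotes inside it survive parsing.
# In a real page they would typically be escaped as &quot;, which BeautifulSoup decodes to the same string.
html='''<div id="chart-1" class="charts-highchart" data-chart='{"chart":{"type":"pie","width":null,"height":null,"backgroundColor"["Male","Female"],"data":[500,442]}],"exporting"pane":null}'
style=""></div>'''
soup=BeautifulSoup(html,'html.parser')
data=soup.select_one('#chart-1[data-chart]')['data-chart']
# re.findall returns strings; wrap in int() if numbers are needed.
items=re.findall(r"(\d+)",data)
item1=items[0]   # first match: '500'
item2=items[-1]  # last match, which is the second value here: '442'
print(item1,item2)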
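One caveat: `\d+` matches every run of digits in the attribute, so if the full attribute on the real page contains other numbers (a non-null width, for instance), they would end up in `items` too. A minimal sketch that narrows the match to the "data" array, assuming the attribute contains a `"data":[...]` list of integers. The markup below is a shortened, hypothetical stand-in for the real attribute (the `"width":300` entry is invented to show the difference), and `extract_data_values` is just an illustrative helper name:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical, shortened markup: "width":300 is invented to show a stray number.
html = '''<div id="chart-1" data-chart='{"width":300,"data":[500,442]}'></div>'''

def extract_data_values(markup):
    """Return the integers inside the "data":[...] list of #chart-1's data-chart attribute."""
    soup = BeautifulSoup(markup, 'html.parser')
    data = soup.select_one('#chart-1[data-chart]')['data-chart']
    # Capture only the text between the brackets that follow "data":
    match = re.search(r'"data"\s*:\s*\[([^\]]*)\]', data)
    if match is None:
        return []
    # Convert each run of digits inside the brackets to an int.
    return [int(n) for n in re.findall(r'-?\d+', match.group(1))]

item1, item2 = extract_data_values(html)
print(item1, item2)
```

If the complete attribute turns out to be valid JSON, parsing it with `json.loads` and indexing into the result would likely be more robust than any regex, but the snippet in the question is truncated, so that cannot be confirmed here.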