<table> becomes empty when I'm trying to get it via BeautifulSoup

Question (votes: 1, answers: 1)

I am trying to parse a table on the site https://www.kp.ru/best/kazan/abiturient_2018/ivmit/. Chrome's DevTools shows me that the table is:

<div class="t431__table-wapper" data-auto-correct-mobile-width="false"> 
<table class="t431__table " style="">
...
</table>
</div>

But when I do this:

import requests
from bs4 import BeautifulSoup

url = r"https://www.kp.ru/best/kazan/abiturient_2018/ivmit/"
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
tag = soup.find_all('div', {'class':r't431__table-wapper'})
print(tag)

It returns a result in which the <table> is empty:

[<div class="t431__table-wapper" data-auto-correct-mobile-width="false">
<table class="t431__table" style=""></table></div>, 
<div class="t431__table-wapper" data-auto-correct-mobile-width="false">
<table class="t431__table" style=""></table></div>,
<div class="t431__table-wapper" data-auto-correct-mobile-width="false">
<table class="t431__table" style=""></table></div>,
<div class="t431__table-wapper" data-auto-correct-mobile-width="false">
<table class="t431__table" style=""></table></div>]

Is it JavaScript or something else? How can I solve this?

python parsing web-scraping beautifulsoup screen-scraping
1 Answer
1 vote

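The table is most likely filled in by JavaScript after the page loads, so the HTML that requests downloads contains only the empty <table>. You can get that information from another tag: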

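import requests
from bs4 import BeautifulSoup as bs

url = 'https://www.kp.ru/best/kazan/abiturient_2018/ivmit/'
# The 'lxml' parser requires the lxml package to be installed
soup = bs(requests.get(url).content, 'lxml')
# Read the data from the .t431__data-part2 element instead of the empty <table>
print(soup.select_one('.t431__data-part2').text)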

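
If the text from .t431__data-part2 needs to be turned back into rows and cells, something like the following might help. It is only a sketch. It assumes the element holds one table row per line with cells separated by semicolons, which is an assumption about this page's markup and is not confirmed in the question. The function name parse_table_text and the sample string are hypothetical and exist only for illustration.

```python
import csv
import io


def parse_table_text(text, delimiter=";"):
    """Split delimiter-separated text into a list of rows (lists of cells).

    Assumes one table row per line; lines with no content are skipped.
    """
    reader = csv.reader(io.StringIO(text.strip()), delimiter=delimiter)
    rows = []
    for row in reader:
        # Trim surrounding whitespace from every cell
        cells = [cell.strip() for cell in row]
        # Keep the row only if at least one cell is non-empty
        if any(cells):
            rows.append(cells)
    return rows


# Made-up sample standing in for soup.select_one('.t431__data-part2').text
sample = "Program;Places\nApplied Mathematics;25\n\nInformatics;30\n"
rows = parse_table_text(sample)
for row in rows:
    print(row)
```

With the answer's code, the input would be soup.select_one('.t431__data-part2').text. If the page turns out to use a different separator, pass it as the delimiter argument.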
