Web scraping using div tags


Here is a screenshot of the following website: news ge

I want to extract the comments. As you can see, they sit under a div tag with the class c_comment, so I wrote the following code:

import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
url = 'https://www.ambebi.ge/article/318245-chazirvamde-ramdenime-saatit-adre-titanikidan-bri/'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')
print(soup.find_all("div", class_='c_comment'))

But it only returns an empty list, []. How can I fix this?

python web-scraping beautifulsoup
1 Answer

Does the code below print the HTML content, and does that output contain the c_comment div?

import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
url = 'https://www.ambebi.ge/article/318245-chazirvamde-ramdenime-saatit-adre-titanikidan-bri/'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')

# Print the entire HTML that the server returned
print(soup.prettify())

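The div may be inserted by JavaScript after the initial HTML has loaded. In that case requests never sees it, because it only downloads the HTML the server returns and does not execute scripts, so find_all returns an empty list.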
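
One way to test that guess without reading the whole prettified dump is to count the matching divs in the raw response text. The following is a minimal sketch using only the standard library. The helper name count_divs_with_class and both sample HTML strings are hypothetical and are not taken from the real page.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts <div> start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        # attrs is a list of (name, value) pairs; a value can be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_divs_with_class(html, class_name):
    """Return how many <div> elements in the given HTML text carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical samples: one page with the comment in the static HTML,
# one page where a script would have to add it later.
static_html = '<div class="c_comment">Nice article</div>'
dynamic_html = '<div id="comments"></div><script src="comments.js"></script>'

print(count_divs_with_class(static_html, "c_comment"))
print(count_divs_with_class(dynamic_html, "c_comment"))
```

Calling count_divs_with_class(response.text, "c_comment") on the response from the code above would show whether the server sends the comments at all. If the count is 0 while the browser's inspector shows the div, the comments are probably loaded by a script. The usual options are then to render the page with a browser automation tool, or to find the request that fetches the comments in the browser's network tab and call that URL directly.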
