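Here is a screenshot of the following website: news ge
I want to extract the comments. As you can see, they sit under a div tag with the class c_comment, so I wrote the following code: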
import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
url = 'https://www.ambebi.ge/article/318245-chazirvamde-ramdenime-saatit-adre-titanikidan-bri/'
# Fetch the page and parse the returned HTML
content = requests.get(url, headers=headers)
content = BeautifulSoup(content.text, 'html.parser')
# Look for every div whose class is c_comment
print(content.find_all("div", class_='c_comment'))
But it only returns an empty list, []. How can I fix this?
Does the code below print the HTML content?
import requests
from bs4 import BeautifulSoup
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
url = 'https://www.ambebi.ge/article/318245-chazirvamde-ramdenime-saatit-adre-titanikidan-bri/'
content = requests.get(url, headers=headers)
soup = BeautifulSoup(content.text, 'html.parser')
# Print the entire HTML
print(soup.prettify()) # This will print out the entire HTML of the page
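The div may be mounted by JavaScript after the initial HTML has loaded. If so, it would not be present in the response that requests receives, which would explain the empty list.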
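One way to test that guess is to check how the class name appears in the raw HTML before any JavaScript runs. The sketch below is only an illustration: the names ClassCounter and diagnose are hypothetical helpers, not part of requests or BeautifulSoup, and it uses the standard library's html.parser so it runs without extra packages.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count elements whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def diagnose(html, class_name):
    """Classify how class_name appears in the raw, unrendered HTML."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    if parser.count > 0:
        return "static"       # elements exist in the served HTML
    if class_name in html:
        return "script-only"  # name appears only as text, e.g. inside a script
    return "absent"           # not in the response at all
```

With the code from the question, this would be called as diagnose(requests.get(url, headers=headers).text, 'c_comment'). A result other than "static" would be consistent with the comments being added by JavaScript. In that case the usual options are to look in the browser's Network tab for the request that loads the comments and call it directly, or to render the page with a browser automation tool before parsing.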