Any suggestions would be appreciated.
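To navigate the soup, you need a BeautifulSoup object, not a string, so remove the get_text() call you make on the soup.
Also, you can replace
raw.find_all('title', limit=1)
with the equivalent
find('title')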
from urllib import request
url = "http://www.bbc.co.uk/news/election-us-2016-35791008"
html = request.urlopen(url).read().decode('utf8')
html[:60]
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
title = soup.find('title')
print(title) # Prints the tag
print(title.string) # Prints the tag string content
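The equivalence between find_all('title', limit=1) and find('title') described above can be checked offline. The following is a minimal sketch; the inline sample_html string is an assumption standing in for the downloaded page, so no network access is needed.

```python
from bs4 import BeautifulSoup

# Inline sample document (an assumption, used instead of fetching a real page).
sample_html = (
    "<html><head><title>Example page</title></head>"
    "<body><p>Hi</p></body></html>"
)
soup = BeautifulSoup(sample_html, "html.parser")

# find_all returns a list-like ResultSet, here capped at one element.
matches = soup.find_all("title", limit=1)
# find returns the first matching Tag directly, or None if nothing matches.
first = soup.find("title")

print(len(matches))         # 1
print(matches[0] is first)  # True
print(first.string)         # Example page

# When the tag is absent, find gives None and find_all gives an empty result.
print(soup.find("h1"))                    # None
print(len(soup.find_all("h1", limit=1)))  # 0
```

The practical difference is the return type: with find_all you still have to index into the result, while find hands you the tag (or None) directly.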
You can use soup.title directly instead of soup.find_all('title', limit=1) or soup.find('title'), and it will give you the title tag.
from urllib import request
url = "http://www.bbc.co.uk/news/election-us-2016-35791008"
html = request.urlopen(url).read().decode('utf8')
html[:60]
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
title = soup.title
print(title)
print(title.string)
Keep it simple:
soup = BeautifulSoup(htmlString, 'html.parser')
title = soup.title.text
Here, soup.title returns the title element (a BeautifulSoup Tag), and .text gives its text content.
On some pages I ran into NoneType errors because soup.title was None. One suggestion is:
soup = BeautifulSoup(data, 'html.parser')
if soup.title is not None:
    title = soup.title.string
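The None check above can be wrapped in a small helper so that a page without a title tag yields a fallback value instead of raising an AttributeError. This is only a sketch: get_title and its default parameter are hypothetical names introduced for illustration, and the two HTML strings are made-up samples.

```python
from bs4 import BeautifulSoup


def get_title(markup, default=None):
    """Return the page title, or `default` if there is no <title> tag.

    Hypothetical helper, not part of BeautifulSoup itself.
    """
    soup = BeautifulSoup(markup, "html.parser")
    if soup.title is None:
        # No <title> tag at all: avoid calling .string on None.
        return default
    return soup.title.string


# Made-up sample documents (assumptions for this example).
with_title = "<html><head><title>Example page</title></head></html>"
without_title = "<html><body><p>No title here</p></body></html>"

print(get_title(with_title))                           # Example page
print(get_title(without_title, default="(untitled)"))  # (untitled)
```

Note that soup.title.string can itself be None when the title tag exists but is empty, so callers may still want to handle a None result.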
soup.title
or soup.title.string
or soup.title.text
soup.find('title').text
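The variants listed above mostly return the same thing, but .string and .text differ when the title tag is empty. The sketch below illustrates this with two made-up inline documents (assumptions, not taken from the original page).

```python
from bs4 import BeautifulSoup

# Made-up sample documents: one with a filled title, one with an empty title.
filled = BeautifulSoup("<title>Example page</title>", "html.parser")
empty = BeautifulSoup("<title></title>", "html.parser")

# With a single text child, .string and .text hold the same characters.
print(filled.title.string)        # Example page
print(filled.title.text)          # Example page
print(filled.find("title").text)  # Example page

# With no children, .string is None while .text is an empty string.
print(empty.title.string is None)  # True
print(repr(empty.title.text))      # ''
```

If the result is going to be passed to string methods such as strip(), .text is the more forgiving choice, since it returns a str even for an empty tag.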