Web scraping <td> tag problem - Python 3 with lxml

Question (votes: 0, answers: 1)

I am scraping a web page in Python with the lxml library. My code currently outputs an empty list:

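from lxml import html
import requests

# requests needs an explicit scheme; a bare 'www.example.com' raises MissingSchema
page = requests.get('http://www.example.com')
# Parse the raw response bytes into an lxml element tree
tree = html.fromstring(page.content)

# Prints the root element object, not a list of matches
print(tree)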

python xpath web-scraping lxml

1 Answer

1 vote

HTML != XML. Some HTML5 tags can confuse an XML parser.

Try using BeautifulSoup with the parser set to html5lib.
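A minimal sketch of that suggestion, assuming the goal is to collect the text of each <td> cell (the title hints at this, but the question does not show the actual query). The sample markup and the helper name extract_cells are made up for illustration; in the real script, page.content from requests would be passed in instead. The bs4 and html5lib packages both need to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
# The closing </td> and </tr> tags are deliberately omitted, which
# is valid HTML but not well-formed XML.
SAMPLE_HTML = """
<table>
  <tr><td>alpha<td>beta
  <tr><td>gamma<td>delta
</table>
"""


def extract_cells(markup):
    """Return the stripped text of every <td> cell in the markup."""
    # html5lib parses the way a browser does, so implied end tags
    # and other HTML5 leniencies are handled for us.
    soup = BeautifulSoup(markup, "html5lib")
    return [td.get_text(strip=True) for td in soup.find_all("td")]


if __name__ == "__main__":
    print(extract_cells(SAMPLE_HTML))
```

If this returns the expected cells while the lxml version does not, the parser is the likely cause; if both come back empty, the data may be added to the page by JavaScript after load, which neither parser will see.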
