ModuleNotFoundError when importing unstructured.documents.html

Question  Votes: 0  Answers: 1

I am running this code:

from unstructured.documents.html import HTMLDocument

# Load your HTML file
html_file_path = 'UBER_2019.html'
doc = HTMLDocument.from_file(html_file_path)

# Extract text
text = doc.text

I get the following error:

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[3], line 1
----> 1 from unstructured.documents.html import HTMLDocument
      3 # Load your HTML file
      4 html_file_path = 'UBER_2019.html'

ModuleNotFoundError: No module named 'unstructured.documents.html'

What can I do to fix this?

python machine-learning deep-learning nlp large-language-model
1 answer
0
votes

You need to install the unstructured package:

pip install unstructured

https://pypi.org/project/unstructured/

Then try the import again:

from unstructured.documents.html import HTMLDocument
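
If the import still fails after installing, it may help to check which part of the dotted path is actually missing. The error in the question names 'unstructured.documents.html' rather than 'unstructured', which suggests the package itself may already be installed and only the submodule is absent in that version. The following is a minimal diagnostic sketch using only the standard library; the helper name diagnose_import is hypothetical and not part of unstructured.

```python
import importlib.util


def diagnose_import(dotted_name):
    """Classify why importing `dotted_name` might fail.

    Returns "ok", "package missing", or "submodule missing".
    (Hypothetical helper for illustration, not part of any library.)
    """
    top_level = dotted_name.split(".")[0]
    # find_spec returns None for a top-level name that is not installed.
    if importlib.util.find_spec(top_level) is None:
        return "package missing"
    try:
        # For dotted names, find_spec imports the parent packages first and
        # raises ModuleNotFoundError if one of them does not exist.
        spec = importlib.util.find_spec(dotted_name)
    except ModuleNotFoundError:
        return "submodule missing"
    return "ok" if spec is not None else "submodule missing"


if __name__ == "__main__":
    print(diagnose_import("unstructured.documents.html"))
```

A result of "package missing" points to pip install unstructured in the environment the notebook actually uses. A result of "submodule missing" suggests the installed release does not ship that module; the unstructured documentation describes partition_html in unstructured.partition.html for parsing HTML, so that may be the replacement to look at.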