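I built a web application with Django and want to add a feature that extracts phrases from content. My code works in development but not in production. Using the nltk package, I wrote a function that returns a list of the phrases found in the content. Here is my code: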
import nltk
from typing import List


def extract_phrases(content: str) -> List:
    # Tokenize text into sentences
    sentences = nltk.sent_tokenize(content)
    # Initialize set to store unique phrases
    phrases = set()
    # Iterate through each sentence
    for sentence in sentences:
        # Tokenize sentence into words
        words = nltk.word_tokenize(sentence)
        # Get part-of-speech tags for words
        pos_tags = nltk.pos_tag(words)
        # Initialize list to store the words of the phrase being built
        current_phrase = []
        # Iterate through each word and its part-of-speech tag
        for word, pos_tag in pos_tags:
            # If the word is a noun or an adjective, add it to the current phrase
            if pos_tag.startswith("NN") or pos_tag.startswith("JJ"):
                current_phrase.append(word)
            # If the word is not a noun or adjective and the current phrase is not empty,
            # add the current phrase to the set of phrases and reset it
            elif current_phrase:
                phrases.add(" ".join(current_phrase))
                current_phrase = []
        # Add the last phrase from the current sentence (if any)
        if current_phrase:
            phrases.add(" ".join(current_phrase))
    return list(phrases)
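One way to narrow down a problem like this is to separate the grouping logic from the NLTK calls that depend on downloaded data files. The sketch below uses a hypothetical helper, group_phrases, which is not part of the original code. It takes already tagged (word, tag) pairs and applies the same noun/adjective grouping, so it runs without any NLTK data. The sample tags are written by hand for illustration; real tags would come from nltk.pos_tag. Unlike the original, it returns phrases in order of first appearance rather than in arbitrary set order.

```python
from typing import Iterable, List, Tuple


def group_phrases(tagged: Iterable[Tuple[str, str]]) -> List[str]:
    """Group consecutive noun (NN*) and adjective (JJ*) tokens into phrases."""
    phrases: List[str] = []
    current: List[str] = []
    for word, tag in tagged:
        if tag.startswith(("NN", "JJ")):
            current.append(word)
        elif current:
            # A non-noun, non-adjective token closes the phrase being built
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    # Drop duplicates while keeping first-seen order
    return list(dict.fromkeys(phrases))


# Hand-written tags, for illustration only
sample = [
    ("The", "DT"), ("quick", "JJ"), ("fox", "NN"), ("jumps", "VBZ"),
    ("over", "IN"), ("lazy", "JJ"), ("dogs", "NNS"),
]
print(group_phrases(sample))  # ['quick fox', 'lazy dogs']
```

If this part behaves the same in both environments, the failure is more likely in sent_tokenize, word_tokenize, or pos_tag, which all need data files that NLTK must be able to locate.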
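The function and the whole web application work correctly in the development environment. In production, however, the lines that use the nltk module are not executed.
I activated my virtual environment and ran the following code: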
import nltk
nltk.download('all')
The path to the downloaded nltk data files has to be provided explicitly:
import nltk
NLTK_PATH = 'path of files'
nltk.data.path.append(NLTK_PATH)
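To confirm that the path fix actually took effect on the server, it may help to ask NLTK which resources it can find after the path has been added. The following is a minimal sketch, not a definitive implementation: missing_nltk_resources is a hypothetical helper name, and the resource names in the commented call are an assumption that depends on the installed NLTK version (newer releases may look for differently named tokenizer and tagger resources). It relies only on nltk.data.path and nltk.data.find, which raises LookupError when a resource is not found.

```python
from typing import Iterable, List


def missing_nltk_resources(nltk_path: str, resources: Iterable[str]) -> List[str]:
    """Register nltk_path with NLTK and return the resources it still cannot find."""
    import nltk  # imported here so a missing install surfaces at call time

    # Avoid appending the same directory on every call
    if nltk_path not in nltk.data.path:
        nltk.data.path.append(nltk_path)

    missing = []
    for resource in resources:
        try:
            nltk.data.find(resource)
        except LookupError:
            missing.append(resource)
    return missing


# Example call (path and resource names are placeholders / assumptions):
# NLTK_PATH = 'path of files'
# print(missing_nltk_resources(
#     NLTK_PATH,
#     ["tokenizers/punkt", "taggers/averaged_perceptron_tagger"],
# ))
```

An empty list means every listed resource was found. Running the check once at application start-up and logging the result makes a missing or unreadable data directory visible instead of failing later inside a request.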