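As in the solution provided in this link, I am trying to get all the leaf words of a stem word. I am using the community-contributed package get_word_forms (by @Divyanshu Srivastava). My list looks like this: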
my_list = [' jail', ' belief',' board',' target', ' challenge', ' command']
If I work manually, I do the following (word by word, which is very time-consuming if I have a list of 200 words):
get_word_forms("command")
and get the following output:
{'n': {'command',
'commandant',
'commandants',
'commander',
'commanders',
'commandership',
'commanderships',
'commandment',
'commandments',
'commands'},
'a': set(),
'v': {'command', 'commanded', 'commanding', 'commands'},
'r': set()}
Here 'n' is noun, 'a' is adjective, 'v' is verb, and 'r' is adverb. If I try to reverse-stem the whole list in one go:
[get_word_forms(word) for word in sample]
I get nothing but empty sets:
[{'n': set(), 'a': set(), 'v': set(), 'r': set()},
{'n': set(), 'a': set(), 'v': set(), 'r': set()},
{'n': set(), 'a': set(), 'v': set(), 'r': set()},
{'n': set(), 'a': set(), 'v': set(), 'r': set()},
{'n': set(), 'a': set(), 'v': set(), 'r': set()},
{'n': set(), 'a': set(), 'v': set(), 'r': set()},
{'n': set(), 'a': set(), 'v': set(), 'r': set()}]
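I think I am failing to save the output to the dictionary. Ultimately, I would like my output to be a single list, not broken down into noun, adjective, adverb, or verb. Something like: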
['command','commandant','commandants', 'commander', 'commanders', 'commandership',
'commanderships','commandment', 'commandments', 'commands','commanded', 'commanding', 'commands', 'jail', 'jailer', 'jailers', 'jailor', 'jailors', 'jails', 'jailed', 'jailing'.....] .. and so on.
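One plausible cause, judging from the list in the question, is that every entry starts with a space, so the lookup receives ' command' rather than 'command'. The sketch below illustrates the effect with a stand-in dictionary. FAKE_FORMS and fake_lookup are hypothetical names used only so the snippet runs without the package; the real get_word_forms is assumed to return empty sets for strings it does not recognise, as the output in the question suggests.

```python
# Hypothetical stand-in for get_word_forms so the snippet runs on its own.
FAKE_FORMS = {
    "jail": {"n": {"jail", "jails"}, "a": set(), "v": {"jail", "jailed"}, "r": set()},
}

def fake_lookup(word):
    """Return the stored forms, or four empty sets for an unknown string."""
    empty = {"n": set(), "a": set(), "v": set(), "r": set()}
    return FAKE_FORMS.get(word, empty)

raw = " jail"
print(repr(raw))                   # repr() makes the leading space visible
missed = fake_lookup(raw)          # exact-match lookup fails because of the space
found = fake_lookup(raw.strip())   # stripping whitespace restores the match
print(missed)
print(found)
```

Printing repr(word) for a few entries of the real list is a quick way to check whether stray whitespace is present.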
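from word_forms.word_forms import get_word_forms

# Strip each word, look it up, and keep only the non-empty sets of forms
all_words = [setx for word in my_list for setx in get_word_forms(word.strip()).values() if len(setx)]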
# Flatten the list of sets
all_words = [word for setx in all_words for word in setx]
# Remove the repetitions and sort the set
all_words = sorted(set(all_words))
print(all_words)
['belief', 'beliefs', 'believabilities', 'believability', 'believable', 'believably', 'believe', 'believed', 'believer', 'believers', 'believes', 'believing', 'board', 'boarded', 'boarder', 'boarders', 'boarding', 'boards', 'challenge', 'challengeable', 'challenged', 'challenger', 'challengers', 'challenges', 'challenging', 'command', 'commandant', 'commandants', 'commanded', 'commander', 'commanders', 'commandership', 'commanderships', 'commanding', 'commandment', 'commandments', 'commands', 'jail', 'jailed', 'jailer', 'jailers', 'jailing', 'jailor', 'jailors', 'jails', 'target', 'targeted', 'targeting', 'targets']
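For a longer list, the same steps (strip, look up, flatten, de-duplicate, sort) could be wrapped in one helper. This is only a sketch: collect_forms and demo_lookup are hypothetical names, and demo_lookup is a tiny stand-in for get_word_forms so the example runs without the package installed.

```python
def collect_forms(words, lookup):
    """Return a sorted, de-duplicated list of every form of every word.

    `lookup` is any callable mapping a word to a dict of sets,
    for example get_word_forms from the word_forms package.
    """
    forms = set()
    for word in words:
        # strip() guards against stray whitespace such as ' jail'
        for group in lookup(word.strip()).values():
            forms |= group  # set union also removes repetitions
    return sorted(forms)

def demo_lookup(word):
    """Hypothetical stand-in for get_word_forms with two hard-coded words."""
    table = {
        "jail": {"n": {"jail", "jails"}, "a": set(), "v": {"jail", "jailed"}, "r": set()},
        "board": {"n": {"board", "boards"}, "a": set(), "v": {"board", "boarded"}, "r": set()},
    }
    return table.get(word, {"n": set(), "a": set(), "v": set(), "r": set()})

result = collect_forms([" jail", " board"], demo_lookup)
print(result)
```

With the package installed, the call would presumably be collect_forms(my_list, get_word_forms). Accumulating into a single set avoids building the intermediate list of sets.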