How do I get similar words related to a given word?

Question  Votes: 0  Answers: 1

I am trying to solve an NLP problem where I have a dictionary of words, like:

list_1 = {'phone': 'android', 'chair': 'netflit', 'charger': 'macbook', 'laptop': 'sony'}

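Now, if the input is 'phone', I can easily use the `in` operator to look up the description of phone and its data by key. The problem arises when the input is a variant such as 'phones' or 'Phones'.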

What I want is that if I input 'phone', I get words like these:

'phone' ==> 'Phones','phones','Phone','Phone's','phone's' 

I don't know which word2vec model I can use, or which NLP module can provide a solution like this.

My second question: if I give the word 'Dog', can I get words like 'Puppy', 'Kitty', 'Dog', 'dog', etc.?

I tried something like this, but it gives synonyms:

from nltk.corpus import wordnet as wn
for ss in wn.synsets('phone'): # Each synset represents a diff concept.
    print(ss)

But it returns:

Synset('telephone.n.01')
Synset('phone.n.02')
Synset('earphone.n.01')
Synset('call.v.03')

Instead, I want:

'phone' ==> 'Phones','phones','Phone','Phone's','phone's' 
python nlp nltk gensim spacy
1 Answer
4 votes

WordNet indexes concepts (aka synsets), not words.

Use lemma_names() to access the root words (aka lemmas) in WordNet.

>>> from nltk.corpus import wordnet as wn
>>> for ss in wn.synsets('phone'): # Each synset represents a diff concept.
...     print(ss.lemma_names())
... 
['telephone', 'phone', 'telephone_set']
['phone', 'speech_sound', 'sound']
['earphone', 'earpiece', 'headphone', 'phone']
['call', 'telephone', 'call_up', 'phone', 'ring']

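Lemmas, being the root forms of words, should not carry additional affixes, so you won't find the plurals or other inflected forms that you listed in your desired output.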
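Since WordNet will not hand back the inflected forms, one option is to go the other way and reduce the input to the dictionary key before the lookup. The following is only a minimal sketch using the standard library: `normalize` and `lookup` are hypothetical helper names, the dictionary is the one from the question (assuming its last pair was meant to be `'laptop': 'sony'`), and the plural handling is deliberately naive, so irregular plurals would need a real lemmatizer such as NLTK's WordNetLemmatizer.

```python
def normalize(word):
    """Lowercase a surface form and drop a possessive ending ('s or ')."""
    w = word.strip().lower()
    for suffix in ("'s", "'"):
        if w.endswith(suffix):
            w = w[: -len(suffix)]
            break
    return w


def lookup(word, table):
    """Return table[key] for the normalized word, or None if nothing matches."""
    key = normalize(word)
    if key in table:
        return table[key]
    # Naive plural handling (an assumption, not a full stemmer):
    # try removing "es", then "s", and accept the first form that is a key.
    for suffix in ("es", "s"):
        if key.endswith(suffix) and key[: -len(suffix)] in table:
            return table[key[: -len(suffix)]]
    return None


# Dictionary from the question, assuming 'laptop': 'sony' was intended.
list_1 = {'phone': 'android', 'chair': 'netflit',
          'charger': 'macbook', 'laptop': 'sony'}

for query in ["Phones", "phones", "Phone", "Phone's", "phone's"]:
    print(query, '==>', lookup(query, list_1))
```

With this approach every variant listed in the question resolves to the same 'phone' entry, while a word that is not in the dictionary returns None.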


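Also, words are ambiguous and may need to be disambiguated by context or by part of speech (POS) before you can get "similar" words. For example, "phone" in its verb sense does not mean exactly the same thing as "phone" in its noun senses.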

>>> for ss in wn.synsets('phone'): # Each synset represents a diff concept.
...     print(ss.lemma_names(), '\t', ss.definition())
... 
['telephone', 'phone', 'telephone_set']      electronic equipment that converts sound into electrical signals that can be transmitted over distances and then converts received signals back into sounds
['phone', 'speech_sound', 'sound']   (phonetics) an individual sound unit of speech without concern as to whether or not it is a phoneme of some language
['earphone', 'earpiece', 'headphone', 'phone']   electro-acoustic transducer for converting electric signals into sounds; it is held over or inserted into the ear
['call', 'telephone', 'call_up', 'phone', 'ring']    get or try to get into communication (with someone) by telephone
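To illustrate the POS point, the sketch below groups the lemma names by the part-of-speech tag embedded in each synset name (`<lemma>.<pos>.<sense number>`). It uses only the synset names and lemma lists printed in this thread, hard-coded so that it runs without NLTK; `SYNSETS` and `lemmas_by_pos` are hypothetical names introduced for illustration. With NLTK itself, the same filtering is available directly through `wn.synsets('phone', pos=wn.VERB)`.

```python
# Synset names and lemma lists copied from the output shown in this thread.
SYNSETS = {
    "telephone.n.01": ["telephone", "phone", "telephone_set"],
    "phone.n.02": ["phone", "speech_sound", "sound"],
    "earphone.n.01": ["earphone", "earpiece", "headphone", "phone"],
    "call.v.03": ["call", "telephone", "call_up", "phone", "ring"],
}


def lemmas_by_pos(synsets, pos):
    """Collect unique lemma names from synsets tagged with the given POS."""
    found = []
    for name, lemmas in synsets.items():
        # The POS tag is the middle field of the synset name, e.g. "n" or "v".
        _, tag, _ = name.rsplit(".", 2)
        if tag != pos:
            continue
        for lemma in lemmas:
            if lemma not in found:
                found.append(lemma)
    return found


print("noun senses:", lemmas_by_pos(SYNSETS, "n"))
print("verb senses:", lemmas_by_pos(SYNSETS, "v"))
```

Restricting the lookup to one part of speech keeps verb-only neighbours such as 'call_up' and 'ring' out of the results when the input is meant as a noun.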