nltk: How do I search for connections between certain words?

Problem description Votes: 3 Answers: 1

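I am using nltk and WordNet to link words that belong to certain relation groups. For example, 'parking' and 'building' should share some parent connection. I use hypernyms, but for some words there is no connection.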

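from nltk.corpus import wordnet

x = wordnet.synset('parking.n.01')
y = wordnet.synset('building.n.01')

# _shortest_hypernym_paths is a private method; its argument is a
# simulate_root flag, so the synset passed here only acts as a truthy value.
print(x._shortest_hypernym_paths(y))
print(y._shortest_hypernym_paths(x))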

{Synset('parking.n.01'): 0, Synset('room.n.02'): 1, Synset('position.n.07'): 2, Synset('relation.n.01'): 3, Synset('abstraction.n.06'): 4, Synset('entity.n.01'): 5, Synset('ROOT'): 6}
{Synset('building.n.01'): 0, Synset('structure.n.01'): 1, Synset('artifact.n.01'): 2, Synset('whole.n.02'): 3, Synset('object.n.01'): 4, Synset('physical_entity.n.01'): 5, Synset('entity.n.01'): 6, Synset('ROOT'): 7}

Here the connection goes through 'entity.n.01', which is effectively the root for almost all physical nouns. How can I get something better than that?
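To make the reading of the two dictionaries explicit, here is a minimal sketch that finds the shared ancestor with the smallest combined distance. It uses plain strings in place of Synset objects and copies the distances from the output above; the helper name `closest_common_ancestor` is hypothetical and not part of nltk.

```python
# Hypernym distances copied from the printed output, with plain strings
# standing in for Synset objects (an assumption made for illustration).
parking_paths = {
    "parking.n.01": 0, "room.n.02": 1, "position.n.07": 2,
    "relation.n.01": 3, "abstraction.n.06": 4, "entity.n.01": 5, "ROOT": 6,
}
building_paths = {
    "building.n.01": 0, "structure.n.01": 1, "artifact.n.01": 2,
    "whole.n.02": 3, "object.n.01": 4, "physical_entity.n.01": 5,
    "entity.n.01": 6, "ROOT": 7,
}


def closest_common_ancestor(paths_a, paths_b):
    """Return (node, dist_a, dist_b) for the shared node with the
    smallest summed distance, or None if nothing is shared."""
    shared = set(paths_a) & set(paths_b)
    if not shared:
        return None
    # Ties are broken by name so the result is deterministic.
    best = min(shared, key=lambda n: (paths_a[n] + paths_b[n], n))
    return best, paths_a[best], paths_b[best]


print(closest_common_ancestor(parking_paths, building_paths))
# ('entity.n.01', 5, 6)
```

With these two senses the only shared nodes are 'entity.n.01' and the simulated root, which is why the link looks so generic.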

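I would like to get something like 'parking' -> 'structure' -> 'building'. The chain can be longer, but unrelated words should not be in it; for example, 'monkey' would also be pulled up to entity.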

python nltk wordnet
1 Answer
1
vote

I found a somewhat useful way to look at the possibilities:

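from nltk.corpus import wordnet

def getShortestHypernymPath(word1, word2, nulls=False):
    # Compare every sense of word1 with every sense of word2.
    syns1 = wordnet.synsets(word1)
    syns2 = wordnet.synsets(word2)
    for s1 in syns1:
        for s2 in syns2:
            lch = s2.lowest_common_hypernyms(s1)
            # Pairs with no common hypernym are printed only when nulls=True.
            if len(lch) > 0 or nulls:
                print(s1, '<-->', s2, '===', lch)

# Called directly here; in the original post the function lived in a module named nlpf.
getShortestHypernymPath('parking', 'building', nulls=False)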

Returns:

Synset('parking.n.01') <--> Synset('building.n.01') === [Synset('entity.n.01')]
Synset('parking.n.01') <--> Synset('construction.n.01') === [Synset('abstraction.n.06')]
Synset('parking.n.01') <--> Synset('construction.n.07') === [Synset('abstraction.n.06')]
Synset('parking.n.01') <--> Synset('building.n.04') === [Synset('abstraction.n.06')]
Synset('parking.n.02') <--> Synset('building.n.01') === [Synset('entity.n.01')]
Synset('parking.n.02') <--> Synset('construction.n.01') === [Synset('act.n.02')]
Synset('parking.n.02') <--> Synset('construction.n.07') === [Synset('act.n.02')]
Synset('parking.n.02') <--> Synset('building.n.04') === [Synset('abstraction.n.06')]
Synset('park.v.02') <--> Synset('build.v.05') === [Synset('control.v.01')]

So at least I can moderate the results.
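One possible way to do that moderation is to discard sense pairs whose only common hypernym is too generic. The sketch below is an illustration, not part of the original answer: it transcribes the rows above as plain strings, and the `TOO_GENERIC` blocklist and the `drop_generic` helper are hypothetical names chosen here. With real Synset objects, a depth threshold based on `Synset.min_depth()` might serve the same purpose.

```python
# Rows transcribed from the printed output as (sense1, sense2, common_hypernyms),
# with plain strings standing in for Synset objects.
rows = [
    ("parking.n.01", "building.n.01", ["entity.n.01"]),
    ("parking.n.01", "construction.n.01", ["abstraction.n.06"]),
    ("parking.n.01", "construction.n.07", ["abstraction.n.06"]),
    ("parking.n.01", "building.n.04", ["abstraction.n.06"]),
    ("parking.n.02", "building.n.01", ["entity.n.01"]),
    ("parking.n.02", "construction.n.01", ["act.n.02"]),
    ("parking.n.02", "construction.n.07", ["act.n.02"]),
    ("parking.n.02", "building.n.04", ["abstraction.n.06"]),
    ("park.v.02", "build.v.05", ["control.v.01"]),
]

# Hypothetical blocklist: which ancestors count as "too generic"
# is a judgment call for the application.
TOO_GENERIC = {"entity.n.01", "abstraction.n.06"}


def drop_generic(rows, too_generic=TOO_GENERIC):
    """Keep only pairs that share at least one non-generic hypernym."""
    kept = []
    for s1, s2, common in rows:
        specific = [c for c in common if c not in too_generic]
        if specific:
            kept.append((s1, s2, specific))
    return kept


for s1, s2, common in drop_generic(rows):
    print(s1, "<-->", s2, "===", common)
```

With this blocklist, only the pairs linked through 'act.n.02' and 'control.v.01' remain, so the noun sense 'building.n.01' still has no specific link to 'parking'.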
