我想创建一个alogritmo,允许我说一个树是否包含在另一个树中。感谢this site我管理了一个算法,让我知道二进制树,但我想概括它。
def isSubtree(T,S):
'''
function to say if a tree S is a subtree of another one, T
'''
# Base Case
if S is None:
return True
if T is None:
return True
# Check the tree with root as current node
if (areIdentical(T, S)):
return True
# IF the tree with root as current node doesn't match
# then try left and right subtreee one by one
return isSubtree(T.children, S) or isSubtree(T.children, S) # won't work because we have several children
def areIdentical(root1, root2):
'''
function to say if two roots are identical
'''
# Base Case
if root1 is None and root2 is None:
return True
if root1 is None or root2 is None:
return False
# Check if the data of both roots their and children are the same
return (root1.data == root2.data and
areIdentical(root1.children, root2.children)) # Here problem because it won't work for children
例如 :
>>># first tree creation
>>>text = start becoming popular
>>>textSpacy = spacy_nlp(text1)
>>>treeText = nltk_spacy_tree(textSpacy)
>>>t = WordTree(treeText[0])
>>># second tree creation
>>>question = When did Beyonce start becoming popular?
>>>questionSpacy = spacy_nlp(question)
>>>treeQuestion = nltk_spacy_tree(questionSpacy)
>>>q = WordTree(treeQuestion[0])
>>># tree comparison
>>>isSubtree(t,q)
True
如果这可能有用,这里是我使用的WordTree
类:
class WordTree:
'''Tree for spaCy dependency parsing array'''
def __init__(self, tree, is_subtree=False):
"""
Construct a new 'WordTree' object.
:param array: The array contening the dependency
:param parent: The parent of the array if exists
:return: returns nothing
"""
self.parent = []
self.children = []
self.data = tree.label().split('_')[0] # the first element of the tree # We are going to add the synonyms as well.
for subtree in tree:
if type(subtree) == Tree:
# Iterate through the depth of the subtree.
t = WordTree(subtree, True)
t.parent=tree.label().split('_')[0]
elif type(subtree) == str:
surface_form = subtree.split('_')[0]
self.children.append(surface_form)
它适用于使用Spacy短语制作的树木。
question = "When did Beyonce start becoming popular?"
questionSpacy = spacy_nlp(question)
treeQuestion = nltk_spacy_tree(questionSpacy)
t = WordTree(treeQuestion[0])
你可以迭代T
的所有孩子,如果S
是T
任何一个孩子的子树,那么S
是T
的子树。此外,当False
是T
时,你应该返回None
,因为这意味着你已经在T
的叶子,S
仍然没有被发现是一个子树:
def isSubtree(T,S):
if S is None:
return True
if T is None:
return False
if areIdentical(T, S):
return True
return any(isSubtree(c, S) for c in T.children)