I am working with text that contains biomedical entities. However, the medspacy package fails to detect them:
import medspacy
nlp = medspacy.load()
text = "The patient was treated with warfarin for atrial fibrillation."
doc = nlp(text)
for ent in doc.ents:
    print(f"Entity: {ent.text}, Label: {ent.label_}")
print("Total Entities: " + str(len(doc.ents)))
Output: Total Entities: 0
I tried to check whether the pipeline contains any component for entity detection:
print(nlp.pipe_names)
which lists the following pipeline components: ['medspacy_pyrush', 'medspacy_target_matcher', 'medspacy_context']
Can you help me find a solution to this problem? Thanks!
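The medspacy_target_matcher component in the default pipeline starts with no rules, so it has nothing to match. Add TargetRule objects to the TargetMatcher to detect entities. Here is a code snippet: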
import medspacy
from medspacy.ner import TargetMatcher, TargetRule
# Load medspacy pipeline
nlp = medspacy.load()
# Retrieve the TargetMatcher that is already in the pipeline
target_matcher = nlp.get_pipe("medspacy_target_matcher")
# Create TargetRule objects for entity patterns
target_rules = [
    TargetRule("warfarin", "MEDICATION"),
    TargetRule("atrial fibrillation", "CONDITION"),
]
target_matcher.add(target_rules)
The rest of the code stays the same:
text = "The patient was treated with warfarin for atrial fibrillation."
doc = nlp(text)
for ent in doc.ents:
    print(f"Entity: {ent.text}, Label: {ent.label_}")
print("Total Entities: " + str(len(doc.ents)))
Output:
Entity: warfarin, Label: MEDICATION
Entity: atrial fibrillation, Label: CONDITION
Total Entities: 2
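
To make it clearer why the original pipeline returned zero entities, here is a rough standard-library sketch of rule-based literal matching. It is not medspacy's implementation: `match_literals` is a hypothetical helper, and the case-insensitive, whole-word matching is an assumption chosen for this illustration. The point is only that a rule-based matcher with an empty rule list has nothing to find.

```python
import re
from typing import List, Tuple


def match_literals(text: str, rules: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (matched_text, label) pairs for every literal rule found in text.

    Hypothetical helper for illustration only; not part of medspacy.
    """
    found = []
    for literal, label in rules:
        # Word boundaries keep "warfarin" from matching inside a longer word.
        pattern = r"\b" + re.escape(literal) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append((m.start(), m.group(0), label))
    # Sort by character offset so results follow the order of the text.
    found.sort(key=lambda item: item[0])
    return [(span_text, label) for _, span_text, label in found]


text = "The patient was treated with warfarin for atrial fibrillation."

# No rules: nothing can match, which mirrors the "Total Entities: 0" result.
no_rules = match_literals(text, [])

# With rules: both literals are found and labeled.
with_rules = match_literals(
    text,
    [("warfarin", "MEDICATION"), ("atrial fibrillation", "CONDITION")],
)

print(len(no_rules))
print(with_rules)
```

In the real pipeline the same idea applies: the matcher only reports what its rules describe, so any additional drug or condition has to be added as its own TargetRule.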