ScispaCy in Google Colab

Question (votes: 0, answers: 2)

I am trying to build an NER model for clinical data using ScispaCy in Colab. I installed the packages like this:

!pip install spacy
!pip install scispacy
!pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.2.4/en_core_sci_md-0.2.4.tar.gz  # pip install <Model URL>

Then I imported them using:

import scispacy
import spacy
import en_core_sci_md

Then I used the following code to display the sentences and entities:

nlp = spacy.load("en_core_sci_md")
text ="""Myeloid derived suppressor cells (MDSC) are immature myeloid cells with immunosuppressive activity. They accumulate in tumor-bearing mice and humans with different types of cancer, including hepatocellular carcinoma (HCC)""" 
doc = nlp(text)
print(list(doc.sents))
print(doc.ents)

I get the following error:

OSError: [E050] Can't find model 'en_core_sci_md'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

I don't know why this error occurs; I followed all the code from the official ScispaCy GitHub page. Any help would be appreciated. Thanks in advance.

python nlp spacy named-entity-recognition
2 Answers
3 votes

I hope I'm not too late... I believe you were already very close to the right approach.

I'll write my answer in steps, and you can choose where to stop.

Step 1)

# Install the en_core_sci_lg model (large) from the scispaCy release URL; en_core_sci_md (medium) can be installed the same way.
!pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.2.4/en_core_sci_lg-0.2.4.tar.gz
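
Since the medium and large models differ only in the name inside the URL, the install line can be built from a model name and a release number. A minimal sketch, assuming the URL pattern used in this thread also holds for other releases; `model_url` is a hypothetical helper, not part of scispaCy:

```python
# Hypothetical helper: builds the download URL from the pattern seen in the
# install commands in this thread. That the pattern holds for other releases
# is an assumption; check the scispaCy README for the release you need.
BASE = "https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases"


def model_url(model: str, version: str) -> str:
    """Return the tarball URL for a scispaCy model at a given release."""
    return f"{BASE}/v{version}/{model}-{version}.tar.gz"


print(model_url("en_core_sci_md", "0.2.4"))
```

In Colab the printed URL can then be passed to `!pip install`. As far as I know, the model release should match the installed scispacy version, so it may be worth checking `pip show scispacy` before picking the release number.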

Step 2)

# Import the large model package
import en_core_sci_lg
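
If this import fails, or `spacy.load` raises E050 as in the question, it can help to check whether the model package is visible to the running interpreter at all. A small sketch using only the standard library; `is_installed` is a name made up for illustration:

```python
import importlib.util


def is_installed(package: str) -> bool:
    """True if `package` can be found by the current Python interpreter."""
    return importlib.util.find_spec(package) is not None


# Check both model packages mentioned in this thread.
for name in ("en_core_sci_lg", "en_core_sci_md"):
    print(name, "installed:", is_installed(name))
```

A `False` here means the pip step did not put the model into this environment, so the pip output is the first place to look for the actual failure.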

Step 3)

# Identify entities
from spacy import displacy

nlp = en_core_sci_lg.load()
doc = nlp(text)  # `text` is the string defined in the question
displacy.render(doc, jupyter=True, style="ent")  # draws inline in the notebook

Step 4)

#Print only the entities
print(doc.ents)

Step 5)

# Save the result as a list of entity strings
save_res = [ent.text for ent in doc.ents]
save_res

Step 6)

# Save the results to a dataframe
import pandas as pd

df_save_res = pd.DataFrame(save_res)
df_save_res
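
If you want more than the entity text in the dataframe, each entity can be turned into one row first. A sketch, assuming only the spaCy `Span` attributes `text`, `label_`, `start_char` and `end_char`; `entities_to_rows` is a hypothetical helper, and the stand-in entity below is invented so the block runs without a model:

```python
from types import SimpleNamespace


def entities_to_rows(ents):
    """Turn spaCy-style entity spans into plain dicts, one per entity."""
    return [
        {"text": e.text, "label": e.label_, "start": e.start_char, "end": e.end_char}
        for e in ents
    ]


# Stand-in for doc.ents, made up for illustration only.
fake_ents = [
    SimpleNamespace(text="hepatocellular carcinoma", label_="ENTITY",
                    start_char=0, end_char=24)
]
rows = entities_to_rows(fake_ents)
print(rows)
```

With a real document, `pd.DataFrame(entities_to_rows(doc.ents))` gives one row per entity with its label and character offsets.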

Step 7)

# In case you want to visualise the dependency parse
displacy.render(doc, jupyter=True, style="dep")

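0 votes

━━━━━━━━━━━━━━ 500.6/500.6 MB 3.0 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Requirement already satisfied: scispacy in /usr/local/lib/python3.10/dist-packages (0.5.4)
ERROR: Could not find a version that satisfies the requirement (from versions: none)
ERROR: No matching distribution found

This is what I get.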
