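I am currently taking a DataCamp course on Bioconductor packages. One section works with the Zika virus genome through the Biostrings package. Where can I load this variable from? The course says the genome was downloaded from https://www.ncbi.nlm.nih.gov/nuccore/NC_012532.1. Inside DataCamp, if I run dput(zikaVirus) (the variable in the session is called zikaVirus), I get: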
new("DNAStringSet", pool = new("SharedRaw_Pool", xp_list = list(
<pointer: (nil)>), .link_to_cached_object_list = list(<environment>)),
ranges = new("GroupedIRanges", group = 1L, start = 1L, width = 10794L,
NAMES = "NC_012532.1 Zika virus isolate ZIKV/Monkey/Uganda/MR766/1947, complete genome",
elementType = "ANY", elementMetadata = NULL, metadata = list()),
elementType = "DNAString", elementMetadata = NULL, metadata = list())
I cannot use this output to recreate the variable in R.
Try this:
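library(rentrez)
library(Biostrings)
# Download the record from NCBI as FASTA text and write it to a temporary file
tmp <- tempfile()
writeLines(
  entrez_fetch(
    db = "nuccore",
    id = "NC_012532.1",
    rettype = "fasta",
    retmode = "text"
  ),
  tmp
)
# Read the FASTA file back in as a DNAStringSet
dna <- readDNAStringSet(tmp, format = "fasta")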
Check the result:
> print(dna)
DNAStringSet object of length 1:
width seq names
[1] 10794 AGTTGTTGATCTGTGTGAGTCAG...TCGGCGGCCGGTGTGGGGAAATCCATGGTTTCT NC_012532.1 Zika ...
> dput(dna)
new("DNAStringSet", pool = new("SharedRaw_Pool", xp_list = list(
<pointer: (nil)>), .link_to_cached_object_list = list(<environment>)),
ranges = new("GroupedIRanges", group = 1L, start = 1L, width = 10794L,
NAMES = "NC_012532.1 Zika virus, complete genome", elementType = "ANY",
elementMetadata = NULL, metadata = list()), elementType = "DNAString",
elementMetadata = NULL, metadata = list())
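The two dput() outputs agree on the width (10794) but not on the sequence name: the course object carries the longer "isolate ZIKV/Monkey/Uganda/MR766/1947" title. If the name matters for the exercises, it can be set by hand. The following is only a sketch: rename_like_course() is a hypothetical helper, not part of Biostrings or the course, and the short stand-in sequence is there only so the snippet runs without a network connection. In practice you would pass the dna object created above.

```r
library(Biostrings)

# Hypothetical helper (not from Biostrings or the course): returns a copy of a
# single-sequence DNAStringSet with its name replaced.
rename_like_course <- function(x, new_name) {
  stopifnot(is(x, "DNAStringSet"), length(x) == 1L)
  names(x) <- new_name
  x
}

# Name copied from the dput(zikaVirus) output shown in the question
course_name <- "NC_012532.1 Zika virus isolate ZIKV/Monkey/Uganda/MR766/1947, complete genome"

# Stand-in for `dna` (first bases of the printed sequence); replace with `dna`
demo <- DNAStringSet(c("NC_012532.1 Zika virus, complete genome" = "AGTTGTTGATCTGTGTGAGTCAG"))

zikaVirus <- rename_like_course(demo, course_name)
names(zikaVirus)
```

Renaming changes only the label, not the sequence itself, so width(zikaVirus) is unaffected.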