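Suppose I have a data frame in R with one column and n rows of DNA sequences. I need to replace each element in the column with its reverse complement. What is the best way to do this?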
Use Biostrings::reverseComplement:
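# Sample data: 21 random DNA sequences with lengths from 5 to 25
set.seed(2017)
df <- data.frame(
  DNA = sapply(sample(5:25), function(x)
    paste(sample(c("A", "C", "T", "G"), x, replace = TRUE), collapse = "")),
  stringsAsFactors = FALSE
)
library(Biostrings)
# Reverse-complement each sequence; assign to df$DNA instead to overwrite the column in place
df$revComp <- sapply(df$DNA, function(x) as.character(reverseComplement(DNAString(x))),
                     USE.NAMES = FALSE)
head(df)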
# DNA revComp
#1 CGGAGTCACGACAGTAAATTTAAG CTTAAATTTACTGTCGTGACTCCG
#2 TATGGTGCTTCAGTG CACTGAAGCACCATA
#3 ACTGTCAATTGTA TACAATTGACAGT
#4 GGCCTCGAGT ACTCGAGGCC
#5 GAAAACAGTTTGAGAGAG CTCTCTCAAACTGTTTTC
#6 GAATACTAAAAGAGTCA TGACTCTTTTAGTATTC
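
If installing Biostrings is not an option, the same result can probably be had in base R. The following is only a sketch, not part of the answer above: `rev_comp_base` is a made-up helper name, and it assumes the sequences contain nothing but A, C, G and T (no IUPAC ambiguity codes, which Biostrings does handle).

```r
# Hypothetical helper (not part of Biostrings): reverse complement in base R.
# Assumes every sequence uses only the letters A, C, G, T (upper or lower case).
rev_comp_base <- function(seqs) {
  # Complement every base of every sequence in one vectorized call
  comp <- chartr("ACGTacgt", "TGCAtgca", seqs)
  # Split each string into single characters, reverse them, and glue them back
  vapply(strsplit(comp, ""),
         function(chars) paste(rev(chars), collapse = ""),
         character(1))
}

# Two of the sequences from the sample output above
rev_comp_base(c("GGCCTCGAGT", "ACTGTCAATTGTA"))
```

Applied to a data frame like the one above, this would be `df$DNA <- rev_comp_base(df$DNA)`. For anything beyond plain ACGT strings, sticking with Biostrings is the safer choice.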