I am trying to build a basic scatter plot in R from gene expression data.
#import data:
oldmice <- read.table("oldmice.txt", header = TRUE)
youngmice <- read.table("youngmice.txt", header = TRUE)
The imported data look like this. Both files have the same format, but the MGE values differ:
gene MGE
Sox17 -6.74193774617653
Mrpl15 -0.212567471203473
Lypla1 -0.711251006455475
and so on..
Basic scatter plot of youngmice$MGE vs oldmice$MGE, made with:
plot(oldmice$MGE, youngmice$MGE, main="old vs young mice!!",
xlab="oldmice$MGE ", ylab="youngmice$MGE ", pch=19)
My question is: how can I color the genes from multiple_gene_list in the plot of oldmice$MGE vs youngmice$MGE? (Only the genes that appear in multiple_gene_list should be marked in the plot.)
Here is my multiple gene list:
multiple_gene_list <- read.table("multiple_gene_list.txt", header = TRUE)
multiple_gene_list <- as.vector(multiple_gene_list )
multiple_gene_list contains:
gene
Six6
Arl2
Tmem74B
Rab9B
Rasgef1B
Ccne1
Apln
Spag7
C17Orf59
Krtap4-4
My goal is to mark only the genes from multiple_gene_list in the oldmice$MGE vs youngmice$MGE plot. I also tried the following code, but it failed!
with(subset(ASC_oldmice_exprs, ASC_oldmice_exprs$gene %in% multiple_gene_list$gene), points(ASC_youngmice_exprs$MGE, pch=20, col="red"))
Thanks!
Let's set up some example data:
multiple_gene_list =structure(list(gene = structure(c(8L, 2L, 10L, 6L, 7L, 4L, 1L,
9L, 3L, 5L), .Label = c("Apln", "Arl2", "C17Orf59", "Ccne1",
"Krtap4-4", "Rab9B", "Rasgef1B", "Six6", "Spag7", "Tmem74B"),
class = "factor")), class = "data.frame", row.names = c(NA,
-10L))
set.seed(111)
oldmice = data.frame(
gene=c("Six6","Arl2","Tmem74B",letters[1:10]),
MGE=runif(13))
youngmice = data.frame(
gene=c("Six6","Arl2","Tmem74B",letters[1:10]),
MGE=runif(13))
Three genes overlap with the list, and we define the colors as follows:
COLS = ifelse(oldmice$gene %in% multiple_gene_list$gene,
"turquoise","orange")
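Before plotting, it can help to inspect the membership test directly. The following is only a minimal sketch on small stand-in data; the values and the names `gene_list`, `expr`, `in_list` and `cols` are made up for illustration and are not part of the original files:

```r
# Hypothetical stand-in data with the same column layout as above.
gene_list <- data.frame(gene = c("Six6", "Arl2", "Tmem74B", "Rab9B"))
expr <- data.frame(gene = c("Six6", "Arl2", "Tmem74B", letters[1:10]),
                   MGE  = seq_len(13) / 10)

# %in% returns one TRUE/FALSE per row of expr, in row order.
in_list <- expr$gene %in% gene_list$gene

# Map the logical vector to one colour per point.
cols <- ifelse(in_list, "turquoise", "orange")

table(cols)           # how many points get each colour
expr$gene[in_list]    # which genes will be highlighted
```

If the count of highlighted genes is lower than expected on real data, mismatched capitalisation or stray whitespace in the gene names is a common cause worth checking.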
And the plot:
plot(oldmice$MGE, youngmice$MGE, main="old vs young mice!!",
xlab="oldmice$MGE ", ylab="youngmice$MGE ", pch=19,col=COLS)
sel = oldmice$gene %in% multiple_gene_list$gene
text(x=oldmice$MGE[sel]+0.01,
y=youngmice$MGE[sel]+0.01,
oldmice$gene[sel])
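The code above relies on oldmice and youngmice listing the genes in the same row order. If that is not guaranteed for the real files, one option is to join the two tables by gene first. This is a minimal sketch, assuming both tables have `gene` and `MGE` columns as in the question; the toy values and the names `old_df`, `young_df`, `gene_list`, `both` and `hit` are made up for illustration:

```r
# Toy tables whose rows are deliberately in different orders.
old_df   <- data.frame(gene = c("Six6", "a", "Arl2", "b"),
                       MGE  = c(0.2, 0.5, 0.8, 0.4))
young_df <- data.frame(gene = c("b", "Arl2", "a", "Six6"),
                       MGE  = c(0.3, 0.9, 0.6, 0.1))
gene_list <- data.frame(gene = c("Six6", "Arl2", "Tmem74B"))

# Join on gene so each row holds the old and young value of the same gene.
both <- merge(old_df, young_df, by = "gene", suffixes = c("_old", "_young"))

# Flag the genes of interest.
hit <- both$gene %in% gene_list$gene

# Draw all points, then overplot and label only the flagged genes.
plot(both$MGE_old, both$MGE_young, pch = 19, col = "orange",
     xlab = "old mice MGE", ylab = "young mice MGE")
points(both$MGE_old[hit], both$MGE_young[hit], pch = 19, col = "turquoise")
text(both$MGE_old[hit], both$MGE_young[hit],
     labels = as.character(both$gene[hit]), pos = 3)
```

Note that `merge()` keeps only genes present in both tables by default, so genes missing from either file are dropped from the plot.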