How can I make 1000 permutations of column names to get a test-statistic distribution?


Suppose I have a matrix like this:

dat <- read.table(text = "   code.1 code.2 code.3 code.4
1     82     93     NA     NA
2     15     85     93     NA
3     93     89     NA     NA
4     81     NA     NA     NA",
                  header = TRUE, stringsAsFactors = FALSE)
dat2=data.matrix(dat)

In reality, my matrix has 132 columns and about 15,000 rows. My column names look like this: NoD_14569_norm.1 NoD_14569_norm.2 NoD_14569_norm.3 NoD_14581_30mM.1 NoD_14581_30mM.2 NoD_14581_30mM.3

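What I want to do is create 1000 random permutations of my column names, where everything in the matrix stays the same except that the column names are reshuffled.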

For example, one permutation (shuffle) of the column names would give me this:

  code.2 code.4 code.1 code.3
1     82     93     NA     NA
2     15     85     93     NA
3     93     89     NA     NA
4     81     NA     NA     NA

The goal is to run the following code on each of the 1000 data frames:

subject="all_replicate"
targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt", sep=''))
Treat <- factor(targets$Treatment,levels=c("C","T"))
Replicates <- factor(targets$rep)
design <- model.matrix(~Replicates+Treat)
corfit <- duplicateCorrelation(dat2, block = targets$Subject)
corfit$consensus.correlation
fit <-lmFit(dat2,design,block=targets$Subject,correlation=corfit$consensus.correlation)
fit<-eBayes(fit)
y1=topTable(fit, coef="TreatT", n=nrow(genes),adjust.method="BH",genelist=genes)

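Inside y1 there is a column named P.Value that holds the p-values, and I want to plot its distribution across all 1000 column-name permutations.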

Please advise.

r permutation
1 Answer

Randomly reordering the column names is easy:

set.seed(42)
manyorders <- replicate(1000, sample(colnames(dat2)), simplify=FALSE)
head(manyorders)
# [[1]]
# [1] "code.4" "code.3" "code.1" "code.2"
# [[2]]
# [1] "code.3" "code.2" "code.4" "code.1"
# [[3]]
# [1] "code.3" "code.4" "code.1" "code.2"
# [[4]]
# [1] "code.4" "code.1" "code.3" "code.2"
# [[5]]
# [1] "code.4" "code.1" "code.3" "code.2"
# [[6]]
# [1] "code.4" "code.1" "code.2" "code.3"
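
As a quick sanity check (a sketch that rebuilds the toy matrix from the question, not part of the original answer), every element of manyorders should hold exactly the original column names, only reordered. With just 4 columns there are only 24 possible orderings, so repeats among 1000 draws are unavoidable; with 132 columns a repeat is practically impossible.

```r
# Toy matrix from the question, rebuilt so this block runs on its own.
dat2 <- matrix(c(82, 15, 93, 81,
                 93, 85, 89, NA,
                 NA, 93, NA, NA,
                 NA, NA, NA, NA),
               nrow = 4,
               dimnames = list(1:4, paste0("code.", 1:4)))

set.seed(42)
manyorders <- replicate(1000, sample(colnames(dat2)), simplify = FALSE)

# Every draw should be a reordering of the same names, with no repeats inside it.
is_perm <- vapply(manyorders,
                  function(ord) setequal(ord, colnames(dat2)) && !anyDuplicated(ord),
                  logical(1))
all(is_perm)

# Number of distinct orderings actually drawn (at most factorial(4) = 24 here).
n_distinct <- length(unique(manyorders))
n_distinct
```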

From here, you can do one of the following:

### 1, rename-in-copy
for (ord in manyorders) {
  tmpdat <- `colnames<-`(dat2, ord) # copies and renames in one line ... code-golf
  # ... your code
}

### 2, rename in place
for (ord in manyorders) {
  colnames(dat2) <- ord
  # ... your code
}
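
To get the distribution the question asks for, the loop body has to store one result per permutation. The sketch below is not the limma pipeline from the question. It runs on simulated data and uses a hypothetical stand-in, row_pvalues(), which reads the group (C or T) from each column name and runs a per-row t.test. This assumes the analysis picks up group membership from the column names. In the real analysis, the loop body would hold the lmFit/eBayes/topTable calls and keep the p-value column of y1. The sketch uses 200 permutations to stay fast; the question uses 1000.

```r
set.seed(42)

# Simulated stand-in data: 20 rows, 3 control (C) and 3 treated (T) columns.
toy <- matrix(rnorm(20 * 6), nrow = 20,
              dimnames = list(NULL, c("C.1", "C.2", "C.3", "T.1", "T.2", "T.3")))

# Hypothetical helper: per-row t-test, with groups read from the column names.
row_pvalues <- function(mat) {
  grp <- sub("\\..*$", "", colnames(mat))
  apply(mat, 1, function(x) t.test(x[grp == "C"], x[grp == "T"])$p.value)
}

n_perm <- 200
manyorders <- replicate(n_perm, sample(colnames(toy)), simplify = FALSE)

# One column of p-values per permutation, preallocated.
perm_p <- matrix(NA_real_, nrow = nrow(toy), ncol = n_perm)
for (i in seq_along(manyorders)) {
  tmpdat <- `colnames<-`(toy, manyorders[[i]])  # rename in a copy
  perm_p[, i] <- row_pvalues(tmpdat)
}

# Pooled distribution of p-values across all permutations.
hist(perm_p, breaks = 50, main = "P-values under permuted column names",
     xlab = "p-value")
```

Storing only the p-values (rows x permutations) keeps memory use small even at 15,000 rows and 1000 permutations, since the renamed matrix is discarded on each pass.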

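There are also lapply variants, though many of them involve pre-building a list of renamed matrices. If you are short on memory, you may want to avoid that (which is the motivation behind the rename-in-copy suggestion above).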
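
One lapply-style variant that avoids holding 1000 renamed matrices at once is to rename inside the function and return only the small summary you need. A minimal sketch on the toy matrix follows; summarise_one() is a hypothetical placeholder for the real per-permutation analysis (here it only reports which column position the label code.1 landed in), and vapply is used instead of lapply so the result is a plain integer vector.

```r
# Toy matrix from the question, rebuilt so this block runs on its own.
dat2 <- matrix(c(82, 15, 93, 81,
                 93, 85, 89, NA,
                 NA, 93, NA, NA,
                 NA, NA, NA, NA),
               nrow = 4,
               dimnames = list(1:4, paste0("code.", 1:4)))

set.seed(42)
manyorders <- replicate(1000, sample(colnames(dat2)), simplify = FALSE)

# Hypothetical placeholder for the real per-permutation analysis.
summarise_one <- function(mat) match("code.1", colnames(mat))

# Rename inside the function; only the small result is kept per permutation.
res <- vapply(manyorders,
              function(ord) summarise_one(`colnames<-`(dat2, ord)),
              integer(1))
table(res)
```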
