How to loop over files, partially match each file name, and get the corresponding value in R


This must be very simple, but I'm really stuck. I have many files in /mydir. For example:

Files returned by list.files("/mydir"):

myfiles <- c("new_Ago2_1_LTR_assembly.csv", "new_Ago2_2_LTR_assembly.csv", 
"new_DCLd_1_LTR_assembly.csv", "new_DCLd_2_LTR_assembly.csv", "not_wanted_files")

All of these files have the following format:

  length      A      C      G      T
1     18   1890   3328   1646   3067
2     19   4444   8221   4914   8668
3     20  12090  18073  12903  19726
4     21  38719  35510  30843  41125

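I want to loop over all the files in myfiles, partially match each file name against the names listed below (Ago2_1, Ago2_2, DCLd_1, DCLd_2), and put the corresponding value into getvalue in the code below.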

Ago2_1  <-  29911751
Ago2_2  <-  29564885
DCLd_1  <-  67004254
DCLd_2  <-  77682528

getvalue <-      #this is where I am confused- how can I do the partial match with the file name and put the respective value?!!

Here is the code:

for (i in 1:length(myfiles)) {
  # Pass the file name itself; quoting "myfiles[i]" would look for a file literally named myfiles[i]
  df <- read.table(myfiles[i], header = TRUE)
  df <- df[, c("length", "A", "C", "G", "T")]

  test <- cbind(df$length, (df[, c("A", "T", "G", "C")] / (getvalue[????Need help Here!])))
  ## Additional function to be executed!!
}
1 Answer

We can create a named vector, then use grepl to find the index of the name that occurs in the file name:

# lookup values 
myValues <- setNames(c(29911751, 29564885, 67004254, 77682528), 
                     c("Ago2_1","Ago2_2","DCLd_1","DCLd_2"))

# For loop start (here i is a file name, e.g. for (i in myfiles))
# For testing outside the loop:
# i <- myfiles[1]

# get value
ix <- which(sapply(names(myValues), function(j)grepl(j, i)))
getvalue <- myValues[ ix ]

# For loop end
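
Putting the lookup together with the loop from the question, a minimal sketch might look like the following. It is not part of the original answer: the helper `lookup_value`, the `results` list, and the temporary directory with two stand-in files are all hypothetical, added only so the example runs on its own. It also assumes the files are whitespace-delimited, as in the sample shown in the question, and that a file should be skipped when none (or more than one) of the names matches.

```r
# Lookup values, named by the sample ID that appears in each file name
myValues <- setNames(c(29911751, 29564885, 67004254, 77682528),
                     c("Ago2_1", "Ago2_2", "DCLd_1", "DCLd_2"))

# Hypothetical helper: return the value whose name occurs in file_name,
# or NA when zero or several names match
lookup_value <- function(file_name, values) {
  hit <- vapply(names(values),
                function(id) grepl(id, file_name, fixed = TRUE),
                logical(1))
  if (sum(hit) == 1) values[[which(hit)]] else NA_real_
}

# Stand-in data (assumption): write two files in the question's format
demo_dir <- file.path(tempdir(), "mydir_demo")
dir.create(demo_dir, showWarnings = FALSE)
counts <- data.frame(length = 18:21,
                     A = c(1890, 4444, 12090, 38719),
                     C = c(3328, 8221, 18073, 35510),
                     G = c(1646, 4914, 12903, 30843),
                     T = c(3067, 8668, 19726, 41125))
for (f in c("new_Ago2_1_LTR_assembly.csv", "new_DCLd_2_LTR_assembly.csv")) {
  write.table(counts, file.path(demo_dir, f), quote = FALSE)
}
writeLines("ignore me", file.path(demo_dir, "not_wanted_files"))

# Loop over file names, look up the matching value, and divide the counts by it
myfiles <- list.files(demo_dir)
results <- list()
for (f in myfiles) {
  getvalue <- lookup_value(f, myValues)
  if (is.na(getvalue)) next  # skip files such as "not_wanted_files"

  df <- read.table(file.path(demo_dir, f), header = TRUE)
  results[[f]] <- cbind(length = df$length,
                        df[, c("A", "T", "G", "C")] / getvalue)
}
```

Looping over the file names directly (rather than over indices) keeps `f` usable both for the lookup and for `read.table`. Passing `fixed = TRUE` to `grepl` treats each name as a literal string rather than a regular expression, which is enough for IDs like these.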