R: Writing output files with specific column combinations using (nested) loops

Question

I have the following data frame:

IID.m IID.f score.m measure.m health.m score.f measure.f health.f 
1 2 120 80 8 131 82 5
3 4 121 83 9 119 80 7
5 6 133 78 5 121 87 9
7 8 126 87 8 120 83 4

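So there are two IDs (say, father = IID.m and mother = IID.f), three variables belonging to the first ID (score.m, measure.m and health.m), and three variables belonging to the second ID (score.f, measure.f and health.f).

I need to produce the following output files, each containing four columns:

File 1: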

score.m health.f health.m score.f

File 2:

measure.m health.f health.m measure.f

File 3:

measure.m score.f score.m measure.f

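In other words: two of the three variables for both father and mother, arranged in the order "father's variable 1", "mother's variable 2", "father's variable 2", "mother's variable 1". For every combination of variables, these need to be written to separate tab-delimited output files.

In this case that means only three different output files, because there are only three different combinations (score + health, measure + health, measure + score). In reality I have many more variables and many more possible combinations, which is why I suspect I need a for loop (or a for loop inside a for loop?). How can I do this in R?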

Answer 1

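Consider running combn to get all four-column combinations of the score, measure, and health column indices, then passing the returned list through lapply to build the subsetted data frames. However, you don't want every combination, only those where the f and m columns match, so run Filter on the list of data frames, using a second combn call to build the variable pairs for the grep call.

Data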

txt = 'IID.m IID.f score.m measure.m health.m score.f measure.f health.f 
1 2 120 80 8 131 82 5
3 4 121 83 9 119 80 7
5 6 133 78 5 121 87 9
7 8 126 87 8 120 83 4'

df <- read.table(text = txt, header = TRUE)

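Building the list of data frames

# All 4-column combinations of the value columns (columns 3 onward)
value_combos <- combn(3:ncol(df), 4, simplify = FALSE)

df_list <- lapply(value_combos, function(i) df[, i])

# Regex alternations such as "score|measure" for every pair of variable stems
col_pairs <- lapply(combn(unique(gsub("\\.m|\\.f", "", names(df)[-2:-1])), 2, simplify = FALSE),
                    function(i) paste(i, collapse = "|"))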
col_pairs
# [[1]]
# [1] "score|measure"

# [[2]]
# [1] "score|health"

# [[3]]
# [1] "measure|health"

# For each pair, keep the one data frame whose four columns all match that pair
sub_df_list <- lapply(col_pairs, function(x)
  Filter(function(d) length(grep(x, names(d))) == 4, df_list)[[1]])

sub_df_list
# [[1]]
#   score.m measure.m score.f measure.f
# 1     120        80     131        82
# 2     121        83     119        80
# 3     133        78     121        87
# 4     126        87     120        83

# [[2]]
#   score.m health.m score.f health.f
# 1     120        8     131        5
# 2     121        9     119        7
# 3     133        5     121        9
# 4     126        8     120        4

# [[3]]
#   measure.m health.m measure.f health.f
# 1        80        8        82        5
# 2        83        9        80        7
# 3        78        5        87        9
# 4        87        8        83        4

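# Write each data frame in the list to a tab-delimited file
# (row.names = FALSE keeps the output to exactly four columns)
invisible(lapply(seq_along(sub_df_list), function(i)
  write.table(sub_df_list[[i]], file = paste0("Output", i, ".txt"),
              sep = "\t", quote = FALSE, row.names = FALSE)))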
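Note that the data frames above list both father columns first and then both mother columns, whereas the question asks for the interleaved order "father's variable 1, mother's variable 2, father's variable 2, mother's variable 1". One possible way to get that order is to build the column names directly from each pair of variable stems instead of filtering index combinations. The following is only a minimal sketch under some assumptions: the names stems, out_dir and ordered_list, and the file-naming scheme, are made up for illustration, and the files are written to a temporary directory.

```r
# Same example data as in the answer above.
txt <- "IID.m IID.f score.m measure.m health.m score.f measure.f health.f
1 2 120 80 8 131 82 5
3 4 121 83 9 119 80 7
5 6 133 78 5 121 87 9
7 8 126 87 8 120 83 4"
df <- read.table(text = txt, header = TRUE)

# Variable stems taken from the father (.m) columns, excluding the ID column.
stems <- sub("\\.m$", "", grep("\\.m$", names(df), value = TRUE))
stems <- setdiff(stems, "IID")

# Hypothetical output directory; replace with a real path as needed.
out_dir <- tempdir()

# One file per unordered pair of variable stems.
pairs <- combn(stems, 2, simplify = FALSE)
ordered_list <- lapply(pairs, function(p) {
  # Order: father var1, mother var2, father var2, mother var1.
  cols <- c(paste0(p[1], ".m"), paste0(p[2], ".f"),
            paste0(p[2], ".m"), paste0(p[1], ".f"))
  out <- df[, cols]
  write.table(out,
              file = file.path(out_dir, paste0(p[1], "_", p[2], ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE)
  out
})
```

Which variable counts as "variable 1" within a pair follows the order in which combn returns the stems; if a particular pair needs the opposite order (for example measure before score), rev(p) could be used for that pair. Because combn enumerates the pairs, no explicit nested for loop is needed, and the same code should scale to more variables as long as every stem has both a .m and a .f column.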
