I have a data frame like this:
df = data.frame(
repetition = c(1,1,1,2,2,2),
group = c("a", "b", "c", "a", "b", "c"),
res = c(10,8,9,12,9,10)
)
  repetition group res
1          1     a  10
2          1     b   8
3          1     c   9
4          2     a  12
5          2     b   9
6          2     c  10
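Now, for each repetition, I want to calculate the difference in res between each pair of groups: a and b, a and c, and b and c. However, there could be more groups.
I would like output like this: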
  repetition diff_a_b diff_a_c diff_b_c
1          1        2        1       -1
2          2        3        2       -1
I want columns named like diff_a_b, diff_a_c, and so on. Is there a way to do this with dplyr (or anything else)?
Perhaps you can try
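library(dplyr)  # .by in summarise() needs dplyr >= 1.1.0
library(tidyr)  # for unnest()

df %>%
  summarise(diff = list(setNames(
    # diff() returns second minus first, so negate to get first minus second
    as.data.frame(t(-combn(res, 2, diff))),
    paste0("diff_", combn(group, 2, paste0, collapse = "_"))
  )), .by = repetition) %>%
  unnest(diff)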
which gives
# A tibble: 2 × 4
  repetition diff_a_b diff_a_c diff_b_c
       <dbl>    <dbl>    <dbl>    <dbl>
1          1        2        1       -1
2          2        3        2       -1
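To see why the result of combn(res, 2, diff) is negated, here is a small sketch for the first repetition only. The variable names raw, diffs, labels and named are just for illustration and are not part of the answer above.

```r
res   <- c(10, 8, 9)        # res values of repetition 1 for groups a, b, c
group <- c("a", "b", "c")

# diff() on a pair returns second minus first: b - a, c - a, c - b
raw <- combn(res, 2, diff)

# negating flips each pair to first minus second: a - b, a - c, b - c
diffs <- -raw

# build one label per pair, in the same order combn() visits them
labels <- paste0("diff_", combn(group, 2, paste0, collapse = "_"))

named <- setNames(diffs, labels)
print(named)
```

Because both combn() calls walk the pairs in the same order, the labels line up with the values as long as res and group are in the same row order.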
We can split by repetition, then run diff on all combinations (combn).
> t(sapply(split(df$res, df$repetition), \(x) -combn(x, 2, FUN=diff)))
  [,1] [,2] [,3]
1    2    1   -1
2    3    2   -1
With some cosmetic touches:
> data.frame(repetition=unique(df$repetition),
+ `colnames<-`(t(sapply(split(df$res, df$repetition), \(x) -combn(x, 2, FUN=diff))),
+ paste0('diff_', combn(unique(df$group), 2, paste, collapse='_'))))
  repetition diff_a_b diff_a_c diff_b_c
1          1        2        1       -1
2          2        3        2       -1
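Since the question mentions that there may be more groups, the same idea can be wrapped in a function. The following is only a sketch under assumptions: pairwise_diffs is a hypothetical helper name, and it assumes each repetition has at most one row per group. It looks values up by group name, so it does not rely on the rows being in the same order within every repetition.

```r
df <- data.frame(
  repetition = c(1, 1, 1, 2, 2, 2),
  group = c("a", "b", "c", "a", "b", "c"),
  res = c(10, 8, 9, 12, 9, 10)
)

# Hypothetical helper: pairwise differences of res between groups,
# computed within each repetition, for any number of groups.
pairwise_diffs <- function(df) {
  groups <- sort(unique(df$group))
  pairs  <- combn(groups, 2)                  # 2 x n_pairs matrix of group names
  rows <- lapply(split(df, df$repetition), function(d) {
    vals <- setNames(d$res, d$group)          # look up res by group name
    unname(vals[pairs[1, ]] - vals[pairs[2, ]])  # first minus second per pair
  })
  out <- do.call(rbind, rows)                 # one row per repetition
  colnames(out) <- paste0("diff_", pairs[1, ], "_", pairs[2, ])
  # split() orders the pieces by sorted repetition value
  data.frame(repetition = sort(unique(df$repetition)), out, row.names = NULL)
}

result <- pairwise_diffs(df)
print(result)
```

A group missing from a repetition would produce NA in the affected columns rather than an error.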
Data:
> dput(df)
structure(list(repetition = c(1, 1, 1, 2, 2, 2), group = c("a",
"b", "c", "a", "b", "c"), res = c(10, 8, 9, 12, 9, 10)), class = "data.frame", row.names = c(NA,
-6L))