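I have a data frame that is a PSPP means table. I want to reshape it so that it is easier to work with in calculations.
What do I want to do?
I want the levels of the first variable spread horizontally (as columns) and the levels of the next variable laid out vertically (as rows). See the attached image for details.
The result must allow for the table missing entire factor levels: a level that has no "value" is not included in the CSV input table at all.
I hope this large edit makes my request clearer.
`dput` of an example `df` similar to the posted image: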
df <- structure(list(structure(c(2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L,
1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "v1", "v2"), class = "factor"),
varA = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 1L, 1L,
2L, 3L, 3L, 4L, 4L), .Label = c("k1", "k2", "k3", "k4", "varA"
), class = "factor"), Age = structure(c(1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = c("a1",
"a2", "Age"), class = "factor"), Mean = structure(1:15, .Label = c("10",
"11", "12", "13", "14", "15", "16", "17", "18", "19", "21",
"22", "23", "24", "25", "Mean"), class = "factor"), N = structure(c(1L,
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 2L, 3L, 4L, 5L, 6L,
7L), .Label = c("1", "10", "12", "13", "14", "15", "16",
"2", "3", "4", "5", "6", "7", "8", "9", "N"), class = "factor")), row.names = 2:16, class = "data.frame")
**Update:** my input and desired output: https://postimg.cc/N2GTZd09
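Your expected output is still unclear to me, because your input data and your expected output don't match.
That aside, perhaps this is what you're after?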
library(tidyverse)
df %>%
  rename(group = 1) %>%                # Name the first column
  mutate_at(1, na_if, "") %>%          # Replace "" with NA
  fill(group) %>%                      # Fill NAs in the first column downward
  group_by(group) %>%
  nest() %>%                           # Nest data by group
  mutate(data = map(data, ~.x %>%
    gather(k, v, -varA, -Age) %>%      # Wide to long
    unite(k, varA, k) %>%              # Unite varA with the variable column
    spread(k, v))) %>%                 # Spread from long to wide
  unnest(cols = data)                  # Unnest (tidyr >= 1.0 expects `cols`)
## A tibble: 4 x 10
#  group Age   k1_Mean k1_N  k2_Mean k2_N  k3_Mean k3_N  k4_Mean k4_N
#  <fct> <fct> <chr>   <chr> <chr>   <chr> <chr>   <chr> <chr>   <chr>
#1 v1    a1    10      1     12      3     14      5     16      7
#2 v1    a2    11      2     13      4     15      6     17      8
#3 v2    a1    18      9     NA      NA    22      13    24      15
#4 v2    a2    19      10    21      12    23      14    25      16
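For what it's worth, newer versions of tidyr can do the same reshape in one step with `pivot_wider()`, without nesting. The following is only a sketch under some assumptions: `raw` is a hypothetical stand-in for the question's `df` (same rows, but with plain character and numeric columns instead of factors, and the first column already named `group`), and the `names_glue` pattern is just one way to get the `k1_Mean`-style names. The column order may differ from the output above.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the question's `df`; the v2 / k2 / a1 row is
# absent, as in the original data.
raw <- data.frame(
  group = c("v1", rep("", 7), "v2", rep("", 6)),
  varA  = c("k1", "k1", "k2", "k2", "k3", "k3", "k4", "k4",
            "k1", "k1", "k2", "k3", "k3", "k4", "k4"),
  Age   = c("a1", "a2", "a1", "a2", "a1", "a2", "a1", "a2",
            "a1", "a2", "a2", "a1", "a2", "a1", "a2"),
  Mean  = c(10:19, 21:25),
  N     = c(1:10, 12:16),
  stringsAsFactors = FALSE
)

wide <- raw %>%
  mutate(group = na_if(group, "")) %>%  # "" marks a continued group
  fill(group) %>%                       # carry the group label downward
  pivot_wider(
    id_cols     = c(group, Age),        # one output row per group/Age pair
    names_from  = varA,                 # k1..k4 become column prefixes
    values_from = c(Mean, N),
    names_glue  = "{varA}_{.value}"     # e.g. k1_Mean, k1_N
  )

wide
```

Because `Mean` and `N` stay numeric here, the result can be used in calculations directly, and the combination missing from the input (v2, k2, a1) simply shows up as `NA` in `k2_Mean` and `k2_N`.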