In R, given the following example df:
df <- data.frame(ID = c("ACTA", "ACTZ", "APHT", "ACTA", "ACTZ", "APHT"),
                 date = c("2011-12-21", "2011-12-20", "2011-12-20", "2011-12-21", "2011-12-20", "2011-12-20"),
                 time = c("07:07:40", "07:08:20", "07:10:09", "07:10:43", "07:11:32", "07:12:32"),
                 weight_type = c("weight 1", "weight 1", "weight 1", "weight 2", "weight 2", "weight 2"),
                 combined_weights = c(73.40, 77.70, 73.10, 71.80, 69.60, 68.60))
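I would like to reshape it into a new df.
In other words, I want each row of the new df to hold the data for one ID on one date (i.e. the two measurement times and the two measured body weights).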
The final result should be:
df_new <- data.frame(ID = c("ACTA", "ACTZ", "APHT"),
                     date = c("2011-12-21", "2011-12-20", "2011-12-20"),
                     time_weight_1 = c("07:07:40", "07:08:20", "07:10:09"),
                     time_weight_2 = c("07:10:43", "07:11:32", "07:12:32"),
                     weight_1 = c(73.40, 77.70, 73.10),
                     weight_2 = c(71.80, 69.60, 68.60))
I tried "transpose" and the aggregate functions, but I don't think that is the right direction.
Thanks a lot!
A possible option:
library(tidyverse)
df <- data.frame(
  ID = c("ACTA", "ACTZ", "APHT", "ACTA", "ACTZ", "APHT"),
  date = c("2011-12-21", "2011-12-20", "2011-12-20", "2011-12-21", "2011-12-20", "2011-12-20"),
  time = c("07:07:40", "07:08:20", "07:10:09", "07:10:43", "07:11:32", "07:12:32"),
  weight_type = c("weight 1", "weight 1", "weight 1", "weight 2", "weight 2", "weight 2"),
  combined_weights = c(73.40, 77.70, 73.10, 71.80, 69.60, 68.60)
)
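df |>
  # "weight 1" -> "weight_1", so the new column names contain no spaces
  mutate(weight_type = str_replace(weight_type, " ", "_")) |>
  # one row per ID/date; spread time and combined_weights across the weight types
  pivot_wider(names_from = "weight_type", values_from = c("time", "combined_weights")) |>
  # drop the "combined_weights_" prefix to get weight_1 and weight_2
  rename_with(\(x) str_remove(x, "combined_weights_"), starts_with("comb"))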
#> # A tibble: 3 × 6
#>   ID    date       time_weight_1 time_weight_2 weight_1 weight_2
#>   <chr> <chr>      <chr>         <chr>            <dbl>    <dbl>
#> 1 ACTA  2011-12-21 07:07:40      07:10:43          73.4     71.8
#> 2 ACTZ  2011-12-20 07:08:20      07:11:32          77.7     69.6
#> 3 APHT  2011-12-20 07:10:09      07:12:32          73.1     68.6
Created on 2024-03-17 with reprex v2.1.0
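
If you would rather avoid the tidyverse dependency, the same reshape can probably be done in base R. This is only a rough sketch and not part of the reprex above: it assumes every ID/date pair has exactly one "weight 1" row and one "weight 2" row, and the helper names w1, w2 and df_base are made up for illustration.

```r
df <- data.frame(
  ID = c("ACTA", "ACTZ", "APHT", "ACTA", "ACTZ", "APHT"),
  date = c("2011-12-21", "2011-12-20", "2011-12-20", "2011-12-21", "2011-12-20", "2011-12-20"),
  time = c("07:07:40", "07:08:20", "07:10:09", "07:10:43", "07:11:32", "07:12:32"),
  weight_type = c("weight 1", "weight 1", "weight 1", "weight 2", "weight 2", "weight 2"),
  combined_weights = c(73.40, 77.70, 73.10, 71.80, 69.60, 68.60)
)

# Split the long data into one data frame per weight type
# (w1 and w2 are hypothetical helper names)
keep <- c("ID", "date", "time", "combined_weights")
w1 <- df[df$weight_type == "weight 1", keep]
w2 <- df[df$weight_type == "weight 2", keep]

# Give the value columns their final names before joining
names(w1)[3:4] <- c("time_weight_1", "weight_1")
names(w2)[3:4] <- c("time_weight_2", "weight_2")

# Join the two halves on ID and date, then put the columns in the desired order
df_base <- merge(w1, w2, by = c("ID", "date"))
df_base <- df_base[, c("ID", "date", "time_weight_1", "time_weight_2",
                       "weight_1", "weight_2")]
df_base
```

With more than two weight types this gets repetitive, which is where pivot_wider() is the more convenient choice.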