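I have duplicates across the first three columns, and I want to keep them after pivoting with pivot_wider, but not as list-columns. The initial data, with duplicates: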
dat0 <-
structure(list(id = c("P1", "P1", "P1", "P1", "P1", "P1", "P1",
"P1", "P1", "P1", "P2", "P2", "P2", "P2", "P2", "P2"), analyte = c("A",
"A", "B", "B", "B", "B", "C", "C", "D", "D", "B", "B", "B", "B",
"D", "D"), analyzer = c("X", "Y", "X", "Y", "X", "Y", "X", "Y",
"X", "Y", "X", "Y", "X", "Y", "X", "Y"), result = c(0.7, 0.9,
1.26, 1.23, 1.24, 1.22, 5.7, 5.3, 4.1, 4.2, 1.22, 1.23, 1.21,
1.22, 4.4, 4.5)), row.names = c(NA, -16L), class = c("tbl_df",
"tbl", "data.frame"))
This is what running pivot_wider produces, together with the following message:
dat1 <- dat0 %>%
pivot_wider(names_from = analyzer, values_from = result)
Values from `result` are not uniquely identified; output will contain list-cols.
• Use `values_fn = list` to suppress this warning.
• Use `values_fn = {summary_fun}` to summarise duplicates.
• Use the following dplyr code to identify duplicates.
{data} |>
dplyr::summarise(n = dplyr::n(), .by = c(id, analyte, analyzer)) |>
dplyr::filter(n > 1L)
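For reference, the check suggested in the message can be run on dat0 directly. This is a minimal sketch, assuming dplyr 1.1.0 or later (needed for the .by argument). dat0 is rebuilt with tibble() only so the snippet is self-contained, and dups is just an illustrative name.

```r
library(dplyr)

# Same values as dat0 above, rebuilt compactly so this runs on its own
dat0 <- tibble(
  id       = rep(c("P1", "P2"), times = c(10, 6)),
  analyte  = c("A", "A", "B", "B", "B", "B", "C", "C", "D", "D",
               "B", "B", "B", "B", "D", "D"),
  analyzer = rep(c("X", "Y"), times = 8),
  result   = c(0.7, 0.9, 1.26, 1.23, 1.24, 1.22, 5.7, 5.3, 4.1, 4.2,
               1.22, 1.23, 1.21, 1.22, 4.4, 4.5)
)

# Count rows per id/analyte/analyzer combination and keep those seen more than once
dups <- dat0 %>%
  summarise(n = n(), .by = c(id, analyte, analyzer)) %>%
  filter(n > 1L)

dups
```

For this data, the combinations that come back are the analyte B rows for both P1 and P2, each measured twice per analyzer.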
Desired output, with the duplicates kept:
Thanks for your help.
You can use row_number() within groups to tell the duplicates apart, e.g.,
dat0 %>%
mutate(grp = row_number(), .by = c(id, analyte, analyzer)) %>%
pivot_wider(names_from = analyzer, values_from = result) %>%
select(-grp)
This gives
# A tibble: 8 × 4
  id    analyte     X     Y
  <chr> <chr>   <dbl> <dbl>
1 P1    A        0.7   0.9
2 P1    B        1.26  1.23
3 P1    B        1.24  1.22
4 P1    C        5.7   5.3
5 P1    D        4.1   4.2
6 P2    B        1.22  1.23
7 P2    B        1.21  1.22
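To see why this works, it can help to look at the helper column before pivoting: grp numbers the repeats within each id/analyte/analyzer group, so id, analyte and grp together identify every cell uniquely and pivot_wider no longer needs list-columns. The sketch below assumes dplyr 1.1.0 or later for .by, rebuilds dat0 so it is self-contained, and uses illustrative names (with_grp, wide, wide_alt). It also shows one possible alternative, values_fn = list followed by unnest(), which is not part of the approach above.

```r
library(dplyr)
library(tidyr)

# Same values as dat0 in the question, rebuilt compactly so this runs on its own
dat0 <- tibble(
  id       = rep(c("P1", "P2"), times = c(10, 6)),
  analyte  = c("A", "A", "B", "B", "B", "B", "C", "C", "D", "D",
               "B", "B", "B", "B", "D", "D"),
  analyzer = rep(c("X", "Y"), times = 8),
  result   = c(0.7, 0.9, 1.26, 1.23, 1.24, 1.22, 5.7, 5.3, 4.1, 4.2,
               1.22, 1.23, 1.21, 1.22, 4.4, 4.5)
)

# Step 1: number the repeats within each id/analyte/analyzer group
with_grp <- dat0 %>%
  mutate(grp = row_number(), .by = c(id, analyte, analyzer))

# Step 2: pivot; id, analyte and grp act as the row identifiers, then grp is dropped
wide <- with_grp %>%
  pivot_wider(names_from = analyzer, values_from = result) %>%
  select(-grp)

# Alternative sketch: collect duplicates into list-columns, then expand them in parallel
wide_alt <- dat0 %>%
  pivot_wider(names_from = analyzer, values_from = result, values_fn = list) %>%
  unnest(c(X, Y))

nrow(wide)
nrow(wide_alt)
```

The unnest() variant relies on X and Y having the same number of repeats within each id/analyte pair, which holds for dat0. With unequal repeats, the row_number() version would instead leave NA in the cells that have no counterpart.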