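I want to join a data frame (df) that has multiple rows per station onto another one wherever datasummary$StationID has a match in datatossedtest$StationID (see below). I then need to condense some columns from the matching rows of datatossedtest into a single column in datasummary.
Test data: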
datatossedtest <- tibble(`WID` = c("10A", "11A", "11A", "12A", "10A"), `StationID` = c("A", "B", "B", "AB", "C"), `Issue` = c("Bad", "Not Good", "Bad", "Meh", "Meh"), 'n' = c(7, 3, 6, 5, 4))
datasummary <- tibble(`WID` = c("10A", "11A","12A", "10A", "13A"), `StationID` = c("A", "B","AB","C","D"))
I tried to do this with mutate() and glue(), but I ran into problems because the match is not one-to-one.
datasummary <- datasummary %>%
mutate(comment = ifelse(datasummary$StationID %in% datatossedtest$StationID, glue("{datatossedtest$n} sample(s) were thrown out because of {datatossedtest$Issue}"),"Lookin good"))
Here is the output of my code above:
> datasummary
# A tibble: 5 × 3
WID StationID comment
<chr> <chr> <chr>
1 10A A 7 sample(s) were thrown out because of Bad
2 11A B 3 sample(s) were thrown out because of Not Good
3 12A AB 6 sample(s) were thrown out because of Bad
4 10A C 5 sample(s) were thrown out because of Meh
5 13A D Lookin good
While this is very close to what I want, I need the comment column to include every Issue from the datatossedtest rows whose StationID matches datasummary. Please see my desired output below.
> datasummary
# A tibble: 5 × 3
WID StationID comment
<chr> <chr> <chr>
1 10A A 7 sample(s) were thrown out because of Bad
2 11A B 3 sample(s) were thrown out because of Not Good AND 6 sample(s) were thrown out because of Bad
3 12A AB 6 sample(s) were thrown out because of Bad
4 10A C 5 sample(s) were thrown out because of Meh
5 13A D Lookin good
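Thank you for your time!
You want to use a join for this.
The join will give you one row for each match (so WID 11A, station B will have 2 rows), and you can then use group_by/summarise to collapse each WID/StationID combination back to a single row.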
library(tidyverse)
datatossedtest <- tibble(`WID` = c("10A", "11A", "11A", "12A", "10A"), `StationID` = c("A", "B", "B", "AB", "C"), `Issue` = c("Bad", "Not Good", "Bad", "Meh", "Meh"), 'n' = c(7, 3, 6, 5, 4))
datasummary <- tibble(`WID` = c("10A", "11A","12A", "10A", "13A"), `StationID` = c("A", "B","AB","C","D"))
datasummary %>%
  left_join(datatossedtest, by = c("WID", "StationID")) %>%
  mutate(comment = ifelse(is.na(n), "Lookin good", paste(n, "sample(s) were thrown out because of", Issue))) %>%
  select(WID, StationID, comment) %>%
  group_by(WID, StationID) %>%
  summarise(comment = paste(comment, collapse = " AND ")) %>%
  ungroup()
#> `summarise()` has grouped output by 'WID'. You can override using the `.groups`
#> argument.
#> # A tibble: 5 × 3
#> WID StationID comment
#> <chr> <chr> <chr>
#> 1 10A A 7 sample(s) were thrown out because of Bad
#> 2 10A C 4 sample(s) were thrown out because of Meh
#> 3 11A B 3 sample(s) were thrown out because of Not Good AND 6 sample(…
#> 4 12A AB 5 sample(s) were thrown out because of Meh
#> 5 13A D Lookin good
#> 5 13A D Lookin good
Created on 2023-05-19 with reprex v2.0.2
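If you would rather keep the rows in the original datasummary order, one possible variation is to collapse datatossedtest to one comment per WID/StationID first and join afterwards. This is only a minimal sketch on the same test data: the names `comments` and `result` are made up for illustration, and it assumes that stations with no match should get "Lookin good".

```r
library(dplyr)
library(tibble)

datatossedtest <- tibble(
  WID = c("10A", "11A", "11A", "12A", "10A"),
  StationID = c("A", "B", "B", "AB", "C"),
  Issue = c("Bad", "Not Good", "Bad", "Meh", "Meh"),
  n = c(7, 3, 6, 5, 4)
)
datasummary <- tibble(
  WID = c("10A", "11A", "12A", "10A", "13A"),
  StationID = c("A", "B", "AB", "C", "D")
)

# Hypothetical helper table: one collapsed comment per WID/StationID.
# .groups = "drop" returns an ungrouped tibble and avoids the summarise() message.
comments <- datatossedtest %>%
  group_by(WID, StationID) %>%
  summarise(
    comment = paste(n, "sample(s) were thrown out because of", Issue,
                    collapse = " AND "),
    .groups = "drop"
  )

# left_join() keeps the rows of datasummary in their original order;
# stations without a match get NA, which coalesce() replaces.
result <- datasummary %>%
  left_join(comments, by = c("WID", "StationID")) %>%
  mutate(comment = coalesce(comment, "Lookin good"))

result
```

Because the collapsing happens before the join, the join is one-to-one at most, so datasummary never gains extra rows.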