Need help using mutate() and ifelse() to create a column based on a match, or multiple matches, in a different data frame


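I want to join one data frame (df) to another that has multiple rows, wherever datasummary$StationID has a match in datatossedtest$StationID (see below). I then need to condense certain columns from the matching rows of the datatossedtest data frame into a single column in the datasummary data frame.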

Test data:

datatossedtest <- tibble(`WID` = c("10A", "11A", "11A", "12A", "10A"), `StationID` = c("A", "B", "B", "AB", "C"), `Issue` = c("Bad", "Not Good", "Bad", "Meh", "Meh"), 'n' = c(7, 3, 6, 5, 4))

datasummary <- tibble(`WID` = c("10A", "11A","12A", "10A", "13A"), `StationID` = c("A", "B","AB","C","D"))

I tried to do this with mutate() and glue(), but I ran into problems because the matches are not one-to-one.

datasummary <- datasummary %>% 
  mutate(comment = ifelse(datasummary$StationID %in% datatossedtest$StationID, glue("{datatossedtest$n} sample(s) were thrown out because of {datatossedtest$Issue}"),"Lookin good"))

Here is the output of my code above:

> datasummary
# A tibble: 5 × 3
  WID   StationID comment                                        
  <chr> <chr>     <chr>                                          
1 10A   A         7 sample(s) were thrown out because of Bad     
2 11A   B         3 sample(s) were thrown out because of Not Good
3 12A   AB        6 sample(s) were thrown out because of Bad     
4 10A   C         5 sample(s) were thrown out because of Meh     
5 13A   D         Lookin good  

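While this is very close to what I want, I need every variation of "Issue" from the datatossedtest rows whose StationID matches datasummary to be included in the comment column as well. Please see my desired output below.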

> datasummary
# A tibble: 5 × 3
  WID   StationID comment                                        
  <chr> <chr>     <chr>                                          
1 10A   A         7 sample(s) were thrown out because of Bad     
2 11A   B         3 sample(s) were thrown out because of Not Good AND 6 sample(s) were thrown out because of Bad
3 12A   AB        6 sample(s) were thrown out because of Bad     
4 10A   C         5 sample(s) were thrown out because of Meh     
5 13A   D         Lookin good  

Thank you for your time!

r if-statement dplyr tibble
1 Answer

You want to use a join for this.
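For context, here is a minimal base-R sketch of why the ifelse() attempt in the question misaligns rows. ifelse() selects values by position, so row i of datasummary receives message i built from datatossedtest, regardless of StationID. The names tossed, summary_ids, has_match, msgs and positional are illustrative only, and paste() stands in for glue() just to avoid an extra package.

```r
# Base R only; values copied from the question's test data.
tossed <- data.frame(
  StationID = c("A", "B", "B", "AB", "C"),
  Issue     = c("Bad", "Not Good", "Bad", "Meh", "Meh"),
  n         = c(7, 3, 6, 5, 4)
)
summary_ids <- c("A", "B", "AB", "C", "D")

# Logical vector, one entry per datasummary row: TRUE TRUE TRUE TRUE FALSE
has_match <- summary_ids %in% tossed$StationID

# One message per datatossedtest row, in datatossedtest's row order
msgs <- paste(tossed$n, "sample(s) were thrown out because of", tossed$Issue)

# ifelse() takes msgs[i] for row i; it never looks anything up by StationID
positional <- ifelse(has_match, msgs, "Lookin good")

summary_ids[3]       # "AB"
positional[3]        # "6 sample(s) were thrown out because of Bad"
tossed$StationID[3]  # "B", so the message on the AB row really belongs to B
```

The two vectors only line up by accident when both data frames have the same number of rows; matching on the key columns is what a join does.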

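The join will give you one row per match (so WID 11A, StationID B will have 2 rows), and you can use group_by() / summarise() to collapse each WID/StationID combination back into a single row.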
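To make the one-row-per-match behaviour concrete, here is a small sketch using the question's test data. The name joined is illustrative, and the sketch assumes dplyr is installed.

```r
library(dplyr)

datatossedtest <- tibble(
  WID       = c("10A", "11A", "11A", "12A", "10A"),
  StationID = c("A", "B", "B", "AB", "C"),
  Issue     = c("Bad", "Not Good", "Bad", "Meh", "Meh"),
  n         = c(7, 3, 6, 5, 4)
)
datasummary <- tibble(
  WID       = c("10A", "11A", "12A", "10A", "13A"),
  StationID = c("A", "B", "AB", "C", "D")
)

# Every datasummary row is kept; rows with several matches are repeated
joined <- left_join(datasummary, datatossedtest, by = c("WID", "StationID"))

nrow(datasummary)          # 5
nrow(joined)               # 6, because 11A / B matched two rows
filter(joined, is.na(n))   # the unmatched 13A / D row, with NA in Issue and n
```

The NA left in n for the unmatched row is what the is.na(n) check in the answer's code relies on.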

library(tidyverse)

datatossedtest <- tibble(`WID` = c("10A", "11A", "11A", "12A", "10A"), `StationID` = c("A", "B", "B", "AB", "C"), `Issue` = c("Bad", "Not Good", "Bad", "Meh", "Meh"), 'n' = c(7, 3, 6, 5, 4))

datasummary <- tibble(`WID` = c("10A", "11A","12A", "10A", "13A"), `StationID` = c("A", "B","AB","C","D"))

datasummary %>%
    # one row per match; unmatched rows get NA in Issue and n
    left_join(datatossedtest, by = c("WID", "StationID")) %>%
    mutate(comment = ifelse(is.na(n), "Lookin good", paste(n, "sample(s) were thrown out because of", Issue))) %>%
    select(WID, StationID, comment) %>%
    # collapse repeated WID/StationID rows back into a single comment
    group_by(WID, StationID) %>%
    summarise(comment = paste(comment, collapse = " AND ")) %>%
    ungroup()
#> `summarise()` has grouped output by 'WID'. You can override using the `.groups`
#> argument.
#> # A tibble: 5 × 3
#>   WID   StationID comment                                                       
#>   <chr> <chr>     <chr>                                                         
#> 1 10A   A         7 sample(s) were thrown out because of Bad                    
#> 2 10A   C         4 sample(s) were thrown out because of Meh                    
#> 3 11A   B         3 sample(s) were thrown out because of Not Good AND 6 sample(…
#> 4 12A   AB        5 sample(s) were thrown out because of Meh                    
#> 5 13A   D         Lookin good

Created on 2023-05-19 with reprex v2.0.2
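One possible variation, sketched on the assumption that keeping datasummary's original row order matters: collapse datatossedtest to one comment per WID/StationID first, then join, and fill the unmatched rows with coalesce(). The names comments and result are illustrative. Passing .groups = "drop" returns an ungrouped tibble, which also avoids the grouping message.

```r
library(dplyr)

datatossedtest <- tibble(
  WID       = c("10A", "11A", "11A", "12A", "10A"),
  StationID = c("A", "B", "B", "AB", "C"),
  Issue     = c("Bad", "Not Good", "Bad", "Meh", "Meh"),
  n         = c(7, 3, 6, 5, 4)
)
datasummary <- tibble(
  WID       = c("10A", "11A", "12A", "10A", "13A"),
  StationID = c("A", "B", "AB", "C", "D")
)

# Step 1: one comment per WID/StationID, joining multiple issues with " AND "
comments <- datatossedtest %>%
  group_by(WID, StationID) %>%
  summarise(
    comment = paste(n, "sample(s) were thrown out because of", Issue,
                    collapse = " AND "),
    .groups = "drop"
  )

# Step 2: the join is now one-to-one, so datasummary keeps its rows and order
result <- datasummary %>%
  left_join(comments, by = c("WID", "StationID")) %>%
  mutate(comment = coalesce(comment, "Lookin good"))

result$comment[2]
# "3 sample(s) were thrown out because of Not Good AND 6 sample(s) were thrown out because of Bad"
```

Both approaches should give the same comments. The difference is that this version returns rows in datasummary's order, whereas summarising after the join sorts them by the grouping keys.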
