I have a data frame named df:
warning = c("HAS","NO","HAS","HAS","HAS")
validation = c("OK","OK","WARNING","WARNING","WARNING")
a_b_c_messages1 = c(NA,NA,"good catch",NA,NA)
D_E_f_messages2 = c(NA,NA,NA,"NOT BAD",NA)
g_h_I_messages3 = c(NA,NA,NA,NA,"BETTER")
j_k_l_messages4 = c(NA,NA,NA,NA,NA)
df = tibble(warning, validation, a_b_c_messages1,
            D_E_f_messages2, g_h_I_messages3,
            j_k_l_messages4)
# A tibble: 5 × 6
  warning validation a_b_c_messages1 D_E_f_messages2 g_h_I_messages3 j_k_l_messages4
  <chr>   <chr>      <chr>           <chr>           <chr>           <lgl>
1 HAS     OK         NA              NA              NA              NA
2 NO      OK         NA              NA              NA              NA
3 HAS     WARNING    good catch      NA              NA              NA
4 HAS     WARNING    NA              NOT BAD         NA              NA
5 HAS     WARNING    NA              NA              BETTER          NA
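I want to filter this df and keep the rows where the column warning contains the word "HAS" and at least one of the columns whose name contains the string "messages" is not NA.
Ideally, I want the resulting df to look like this: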
  warning validation a_b_c_messages1 D_E_f_messages2 g_h_I_messages3
  <chr>   <chr>      <chr>           <chr>           <chr>
3 HAS     WARNING    good catch      NA              NA
4 HAS     WARNING    NA              NOT BAD         NA
5 HAS     WARNING    NA              NA              BETTER
warning = c("HAS","NO","HAS","HAS","HAS")
validation = c("OK","OK","WARNING","WARNING","WARNING")
a_b_c_messages1 = c(NA,NA,"good catch",NA,NA)
D_E_f_messages2 = c(NA,NA,NA,"NOT BAD",NA)
g_h_I_messages3 = c(NA,NA,NA,NA,"BETTER")
j_k_l_messages4 = c(NA,NA,NA,NA,NA)
library(tidyverse)
df = tibble(warning, validation, a_b_c_messages1,
            D_E_f_messages2, g_h_I_messages3,
            j_k_l_messages4)
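df %>%
  rownames_to_column(var = "rn") %>%  # keep the original row number
  filter(str_detect(warning, "HAS")) %>%  # only rows flagged "HAS"
  # long format over the message columns, dropping NA messages
  pivot_longer(-c(rn, warning, validation), values_drop_na = TRUE) %>%
  # back to wide: rows and columns without any message do not come back
  pivot_wider(id_cols = c(rn, warning, validation), names_from = name, values_from = value)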
#> # A tibble: 3 × 6
#>   rn    warning validation a_b_c_messages1 D_E_f_messages2 g_h_I_messages3
#>   <chr> <chr>   <chr>      <chr>           <chr>           <chr>
#> 1 3     HAS     WARNING    good catch      <NA>            <NA>
#> 2 4     HAS     WARNING    <NA>            NOT BAD         <NA>
#> 3 5     HAS     WARNING    <NA>            <NA>            BETTER
Created on 2024-11-08 with reprex v2.1.1
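If reshaping feels heavy for this, the same result can probably be reached without pivoting. The following is only a sketch, assuming dplyr 1.0.4 or later (for `if_any()`); the name `result` is mine, not from the question. It filters on the warning column, keeps rows where at least one messages column is not NA, and then drops any column that is entirely NA in what remains.

```r
library(dplyr)
library(tibble)

df <- tibble(
  warning         = c("HAS", "NO", "HAS", "HAS", "HAS"),
  validation      = c("OK", "OK", "WARNING", "WARNING", "WARNING"),
  a_b_c_messages1 = c(NA, NA, "good catch", NA, NA),
  D_E_f_messages2 = c(NA, NA, NA, "NOT BAD", NA),
  g_h_I_messages3 = c(NA, NA, NA, NA, "BETTER"),
  j_k_l_messages4 = c(NA, NA, NA, NA, NA)
)

result <- df %>%
  filter(
    # rows whose warning contains the literal text "HAS"
    grepl("HAS", warning, fixed = TRUE),
    # at least one column with "messages" in its name is not NA
    if_any(contains("messages"), ~ !is.na(.x))
  ) %>%
  # drop columns that are NA in every remaining row
  select(where(~ !all(is.na(.x))))

result
```

Unlike the pivot approach, this does not need a helper row-number column, and the warning condition is stated explicitly instead of relying on which rows happen to have messages. Note that the last step would also drop any non-message column that ends up entirely NA after filtering.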