I have the following sample:
var_1 <- c("1", "15", "35", "abc")
var_1_is_deleted <- c("Yes", "No", "No", "Yes")
var_2 <- c("xyz", NA, NA, "sos")
var_2_is_deleted <- c("No", NA, NA, "Yes")
var_3 <- c(NA, NA, "hello", "def")
var_3_is_deleted <- c(NA, NA, "Yes", "No")
df <- data.frame(var_1, var_1_is_deleted, var_2, var_2_is_deleted, var_3, var_3_is_deleted)
df
  var_1 var_1_is_deleted var_2 var_2_is_deleted var_3 var_3_is_deleted
1     1              Yes   xyz               No  <NA>             <NA>
2    15               No  <NA>             <NA>  <NA>             <NA>
3    35               No  <NA>             <NA> hello              Yes
4   abc              Yes   sos              Yes   def               No
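The number of var_x columns is not fixed; it varies between 1 and 9 depending on the data frame I am working with.
I want to create a new column containing the first var_x value whose corresponding var_x_is_deleted equals "No", as shown below: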
  var_1 var_1_is_deleted var_2 var_2_is_deleted var_3 var_3_is_deleted **var**
1     1              Yes   xyz               No  <NA>             <NA> **xyz**
2    15               No  <NA>             <NA>  <NA>             <NA> **15**
3    35               No  <NA>             <NA> hello              Yes **35**
4   abc              Yes   sos              Yes   def               No **def**
How can I achieve this result using base R or the Tidyverse packages?
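Check whether every second column (the is_deleted flag) equals "No", then subset the value columns by that index:
# c(TRUE, FALSE) is recycled to pick the value columns, c(FALSE, TRUE) the flag columns.
# NA flags are replaced with a placeholder ("x") so the comparison never returns NA.
df[ c(TRUE, FALSE) ][ replace(df[ c(FALSE, TRUE) ], is.na(df[ c(FALSE, TRUE) ]), "x") == "No" ]
# [1] "15" "35" "xyz" "def"
# Note: the values come back in column order (var_1 first, then var_2, ...), not row order.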
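The one-liner above returns the matches in column order, so they do not line up with the rows of df. If the goal is the new var column from the question, a row-wise variant may be easier to work with. The following is a minimal base R sketch, not a definitive solution. It assumes the columns follow the var_N / var_N_is_deleted naming from the sample, that "first" means the order in which the columns appear in the data frame, and that a row with no "No" flag should get NA. The helper names flag_cols, value_cols, keep, first_no and values are illustrative only.

```r
# Rebuild the sample data from the question.
df <- data.frame(
  var_1 = c("1", "15", "35", "abc"),
  var_1_is_deleted = c("Yes", "No", "No", "Yes"),
  var_2 = c("xyz", NA, NA, "sos"),
  var_2_is_deleted = c("No", NA, NA, "Yes"),
  var_3 = c(NA, NA, "hello", "def"),
  var_3_is_deleted = c(NA, NA, "Yes", "No")
)

# Find the flag columns by name and derive the matching value columns,
# so the code works for any number of var_N pairs (assumed naming scheme).
flag_cols  <- grep("^var_[0-9]+_is_deleted$", names(df), value = TRUE)
value_cols <- sub("_is_deleted$", "", flag_cols)

# Logical matrix: TRUE where the flag is exactly "No"; NA flags count as FALSE.
keep <- !is.na(df[flag_cols]) & df[flag_cols] == "No"

# For each row, position of the first TRUE (NA if the row has no "No" flag).
first_no <- apply(keep, 1, function(r) which(r)[1])

# Pick one value per row with (row, column) matrix indexing.
values <- as.matrix(df[value_cols])
df$var <- values[cbind(seq_len(nrow(df)), first_no)]

df$var
# [1] "xyz" "15"  "35"  "def"
```

Because the column pairs are located by name rather than by position, this sketch does not rely on the value and flag columns strictly alternating, which the recycling trick with c(TRUE, FALSE) does.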