I have a data frame in R called df:
library(tibble)

A = c("ok","WA","WA","ok","WA")
B = c("WB","ok","ok","ok","WB")
C = c("WC","ok","WC","ok","WC")
df = tibble(A,B,C)
df
  A     B     C
  <chr> <chr> <chr>
1 ok    WB    WC
2 WA    ok    ok
3 WA    ok    WC
4 ok    ok    ok
5 WA    WB    WC
I want to create (mutate) a new column that concatenates all the values that are not "ok", like this:
  A     B     C     D
  <chr> <chr> <chr> <chr>
1 ok    WB    WC    WB,WC
2 WA    ok    ok    WA
3 WA    ok    WC    WA,WC
4 ok    ok    ok    NO W
5 WA    WB    WC    WA,WB,WC
You could use a gsub approach, with the help of paste:
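A <- c("ok","WA","WA","ok","WA")
B <- c("WB","ok","ok","ok","WB")
C <- c("WC","ok","WC","ok","WC")
df <- data.frame(A=A, B=B, C=C, stringsAsFactors=FALSE)
# Join the three columns row by row, e.g. "ok,WB,WC"
df$D <- paste(df$A, df$B, df$C, sep=",")
# Innermost gsub: replace each standalone "ok" with a comma
# Middle gsub: collapse runs of two or more commas into one
# Outer gsub: strip a leading or trailing comma
df$D <- gsub("^,|,$", "", gsub(",{2,}", ",", gsub("\\bok\\b", ",", df$D)))
df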
   A  B  C        D
1 ok WB WC    WB,WC
2 WA ok ok       WA
3 WA ok WC    WA,WC
4 ok ok ok
5 WA WB WC WA,WB,WC
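The basic strategy here is to remove the ok entries, and then clean up any dangling commas left behind by the paste call. Note that a row whose entries are all ok ends up as an empty string rather than the NO W placeholder shown in the question.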
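As an alternative to the regex cleanup, the same result can be sketched by filtering each row before pasting, which also makes it easy to fill in the NO W placeholder from the question. This is only a minimal base R sketch: the helper name collapse_non_ok and its ok and empty parameters are made up for illustration and are not part of the answer above.

```r
# Sample data from the question
df <- data.frame(
  A = c("ok", "WA", "WA", "ok", "WA"),
  B = c("WB", "ok", "ok", "ok", "WB"),
  C = c("WC", "ok", "WC", "ok", "WC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an assumption, not from the original answer):
# keep only the entries of one row that differ from `ok`, join them
# with commas, and fall back to `empty` when nothing is left.
collapse_non_ok <- function(row, ok = "ok", empty = "NO W") {
  bad <- row[row != ok]
  if (length(bad) == 0) empty else paste(bad, collapse = ",")
}

# apply() with MARGIN = 1 passes each row to the helper as a character vector
df$D <- apply(df[, c("A", "B", "C")], 1, collapse_non_ok)
df
```

If you prefer to keep the gsub approach, the placeholder can be added afterwards with df$D[df$D == ""] <- "NO W".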