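I have a data frame with one column that lists the available (survey) choices and another column that holds the index of the choice made in each row. For example,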
library(tidyverse)  # tibble, dplyr, purrr and stringr are all used below

df <- tibble(
  record_id = 1:9,
  choices = c(rep("1, A | 2, B | 3, C", 3),
              rep("1, Apple | 2, Banana | 3, Cherry", 3),
              rep("1, America | 2, Belgium | 3, China", 3)),
  choice = sample(1:3, size = 9, replace = TRUE)
)
which looks like this:
# A tibble: 9 × 3
  record_id choices                            choice
      <int> <chr>                               <int>
1         1 1, A | 2, B | 3, C                      3
2         2 1, A | 2, B | 3, C                      2
3         3 1, A | 2, B | 3, C                      3
4         4 1, Apple | 2, Banana | 3, Cherry        3
5         5 1, Apple | 2, Banana | 3, Cherry        3
6         6 1, Apple | 2, Banana | 3, Cherry        2
7         7 1, America | 2, Belgium | 3, China      2
8         8 1, America | 2, Belgium | 3, China      3
9         9 1, America | 2, Belgium | 3, China      3
I want to create a column that recodes choice using the labels given in the choices column. For example:
# A tibble: 9 × 4
  record_id choices                            choice label
      <int> <chr>                               <int> <chr>
1         1 1, A | 2, B | 3, C                      3 C
2         2 1, A | 2, B | 3, C                      2 B
3         3 1, A | 2, B | 3, C                      3 C
4         4 1, Apple | 2, Banana | 3, Cherry        3 Cherry
5         5 1, Apple | 2, Banana | 3, Cherry        3 Cherry
6         6 1, Apple | 2, Banana | 3, Cherry        2 Banana
7         7 1, America | 2, Belgium | 3, China      2 Belgium
8         8 1, America | 2, Belgium | 3, China      3 China
9         9 1, America | 2, Belgium | 3, China      3 China
So far I have written a function that recodes a choice, but I cannot get it to work inside mutate in a pipeline:
make_key <- function(.str) {
  lstr <- str_split(.str, pattern = " \\| ")
  out <- map(lstr, ~str_remove(.x, pattern = "^([0-9]+), ")) %>% as_vector()
  out_names <- map(lstr, ~str_extract(.x, pattern = "^([0-9]+)")) %>% as_vector()
  names(out) <- out_names
  return(out)
}
# Working example:
my_string <- c("1, A | 2, B | 3, C")
recode(1, !!!make_key(my_string))
[1] "A"
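But when I try to use it inside a call to dplyr::mutate(), it does not work. I think it has something to do with how the variable name is passed to the function, but I am not sure how.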
rowwise(df) %>%
  mutate(label = recode(choice, !!!make_key(choices)))
Error in stri_split_regex(string, pattern, n = n, simplify = simplify, :
object 'choices' not found
I have tried adding double curly braces {{}}, as in lstr <- str_split({{.str}}, pattern = " \\| "), as well as some rlang functions, e.g. .str <- rlang::as_name(.str) or .str <- rlang::enquo(.str), but nothing has worked so far.
How about:
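library(dplyr)

# Return the label whose numeric code equals `choice` in one `choices` string
pick_label <- \(choices, choice) {
  frags <- unlist(strsplit(choices, ' \\| '))
  # Anchor on "<code>," so that choice 1 does not also match "10, ..."
  frags[grepl(paste0('^', choice, ','), frags)] |>
    sub(pattern = '^[0-9]+, *', replacement = '')
}

df |>
  rowwise() |>
  mutate(label = pick_label(choices, choice)) |>
  ungroup()  # drop the rowwise grouping once the label is computed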
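As for why the attempt in the question fails: as far as I understand, `!!!` is resolved at the moment `mutate()` captures its arguments, before any column of `df` is in scope, so `make_key(choices)` looks for `choices` in the calling environment and does not find it. If you would rather avoid `rowwise()` altogether, here is a minimal sketch using `purrr::map2_chr()`. Note that `parse_key()` and `lookup_label()` are names I made up for illustration, and the three-row `df` is a cut-down stand-in for your data with fixed rather than sampled choices.

```r
library(dplyr)
library(purrr)
library(stringr)

# Hypothetical helper: turn one "1, A | 2, B | 3, C" string into a named
# character vector (names are the codes, values are the labels).
parse_key <- function(key_string) {
  parts  <- str_split(key_string, fixed(" | "))[[1]]
  codes  <- str_extract(parts, "^[0-9]+")
  labels <- str_remove(parts, "^[0-9]+,\\s*")
  set_names(labels, codes)
}

# Hypothetical helper: look up one code in one key string.
# Yields NA if the code does not appear in the key.
lookup_label <- function(key_string, code) {
  unname(parse_key(key_string)[as.character(code)])
}

# Cut-down example data (assumed; fixed choices so the result is reproducible)
df <- tibble(
  record_id = 1:3,
  choices = c("1, A | 2, B | 3, C",
              "1, Apple | 2, Banana | 3, Cherry",
              "1, America | 2, Belgium | 3, China"),
  choice = c(3L, 2L, 1L)
)

# map2_chr() walks over both columns in parallel, one row at a time
result <- df |>
  mutate(label = map2_chr(choices, choice, lookup_label))
```

Because the key is parsed and indexed by name, a code such as 1 cannot accidentally match 10, and labels that themselves contain a comma are kept whole.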