I would like to be able to define arguments for dplyr verbs as strings:
condition <- "dist > 50"
and then use these strings in dplyr functions:
library(dplyr)  # provides %>% and filter()
ds <- cars
ds1 <- ds %>%
filter (eval(condition))
ds1
but this raises an error:
Error: filter condition does not evaluate to a logical vector.
The code should evaluate as:
ds1<- ds %>%
filter(dist > 50)
ds1
resulting in:
ds1
speed dist
1 14 60
2 14 80
3 15 54
4 18 56
5 18 76
6 18 84
7 19 68
8 20 52
9 20 56
10 20 64
11 22 66
12 23 54
13 24 70
14 24 92
15 24 93
16 24 120
17 25 85
How can I pass a string as an argument to a dplyr verb?
Since these 2014 answers, two new approaches have become possible using rlang's quasiquotation.
Conventional hard-coded filter statement. For the sake of comparison, the statement dist > 50 is included directly in dplyr::filter().
library(magrittr)
# The filter statement is hard-coded inside the function.
cars_subset_0 <- function( ) {
cars %>%
dplyr::filter(dist > 50)
}
cars_subset_0()
Result:
speed dist
1 14 60
2 14 80
3 15 54
4 18 56
...
17 25 85
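rlang approach with NSE (nonstandard evaluation). As described in the Programming with dplyr vignette, the statement dist > 50 is processed by rlang::enquo(), which "uses some dark magic to look at the argument, see what the user typed, and return that value as a quosure". Then rlang's !! unquotes the input "so that it's evaluated immediately in the surrounding context".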
# The filter statement is evaluated with NSE.
cars_subset_1 <- function( filter_statement ) {
filter_statement_en <- rlang::enquo(filter_statement)
message("filter statement: `", filter_statement_en, "`.")
cars %>%
dplyr::filter(!!filter_statement_en)
}
cars_subset_1(dist > 50)
Result:
filter statement: `~dist > 50`.
<quosure>
expr: ^dist > 50
env: global
speed dist
1 14 60
2 14 80
3 15 54
4 18 56
...
17 25 85
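For comparison, rlang 0.4.0 and later also provide the embrace operator {{ }}, which performs the enquo() and !! steps in one go. The following is only a minimal sketch, assuming a recent dplyr and rlang are installed; cars_subset_1b and result_1b are hypothetical names that do not appear in the answer above.

```r
library(dplyr)

# Hypothetical variant of cars_subset_1(): `{{ }}` captures the caller's
# expression and unquotes it inside filter() in a single step.
cars_subset_1b <- function(filter_statement) {
  cars %>%
    dplyr::filter({{ filter_statement }})
}

result_1b <- cars_subset_1b(dist > 50)
nrow(result_1b)  # same 17 rows as the hard-coded version
```

The call site is unchanged (a bare expression, not a string), so this variant only shortens the function body.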
rlang approach passing a string. The statement "dist > 50" is passed to the function as an explicit string, parsed into an expression by rlang::parse_expr(), and then unquoted by !!.
# The filter statement is passed as a string.
cars_subset_2 <- function( filter_statement ) {
filter_statement_expr <- rlang::parse_expr(filter_statement)
message("filter statement: `", filter_statement_expr, "`.")
cars %>%
dplyr::filter(!!filter_statement_expr)
}
cars_subset_2("dist > 50")
Result:
filter statement: `>dist50`.
speed dist
1 14 60
2 14 80
3 15 54
4 18 56
...
17 25 85
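To see why the eval(condition) attempt in the question fails while the parsed version works, it may help to inspect what each step returns. This is a small sketch applied to the question's own condition string; condition_expr is a name introduced here for illustration.

```r
library(dplyr)

condition <- "dist > 50"

# Evaluating a string simply returns the string, so filter() receives a
# character vector rather than a logical one.
class(eval(condition))

# parse_expr() converts the string into a call object, which `!!` can
# then splice into filter().
condition_expr <- rlang::parse_expr(condition)
class(condition_expr)

ds1 <- cars %>%
  dplyr::filter(!!condition_expr)
nrow(ds1)
```

Note that parsing arbitrary strings into code is only advisable when the strings come from a trusted source.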
Things are simpler with dplyr::select(). The explicit string needs only !!.
# The select statement is passed as a string.
cars_subset_2b <- function( select_statement ) {
cars %>%
dplyr::select(!!select_statement)
}
cars_subset_2b("dist")
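When several column names are stored in a character vector, a tidyselect helper such as all_of() (re-exported by dplyr 1.0.0 and later) is another option. The sketch below assumes such a dplyr version; cars_subset_2c and result_2c are hypothetical names, not part of the original answer.

```r
library(dplyr)

# Hypothetical variant of cars_subset_2b() that accepts a character
# vector of column names; all_of() errors if any name is missing.
cars_subset_2c <- function(select_columns) {
  cars %>%
    dplyr::select(dplyr::all_of(select_columns))
}

result_2c <- cars_subset_2c(c("dist", "speed"))
names(result_2c)
```

The columns come back in the order given in the vector, which makes this form handy for reordering as well as subsetting.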
In the next version of dplyr, it will probably work like this:
condition <- quote(dist > 50)
cars %>%
filter_(condition)
While they are working on that, here is a workaround using if:
library(dplyr)
library(magrittr)
ds <- data.frame(attend = c(1:5,NA,7:9,NA,NA,12))
filter_na <- FALSE
filtertest <- function(x,filterTF = filter_na){
if(filterTF) x else !(x)
}
ds %>%
filter(attend %>% is.na %>% filtertest)
attend
1 1
2 2
3 3
4 4
5 5
6 7
7 8
8 9
9 12
filter_na <- TRUE
ds %>%
filter(attend %>% is.na %>% filtertest)
attend
1 NA
2 NA
3 NA