Pass strings as parameters in dplyr verbs

Question (votes: 11, answers: 3)

I would like to be able to define the arguments for dplyr verbs as strings:

condition <- "dist > 50"

and then use these strings in dplyr functions:

library(dplyr)  # provides %>% and filter(); the cars dataset ships with base R
ds <- cars
ds1 <- ds %>%
  filter(eval(condition))
ds1

But it throws an error:

Error: filter condition does not evaluate to a logical vector. 
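
The error follows from how eval() treats a character string: it returns the string unchanged rather than running it as code, so filter() receives a character vector instead of a logical one. A minimal base R sketch of the difference, assuming the same `condition` string (the names `evaluated`, `parsed` and `mask` are illustrative only):

```r
condition <- "dist > 50"

# eval() on a character string simply returns the string.
evaluated <- eval(condition)
is.character(evaluated)

# Parsing first turns the string into a call, which can then be
# evaluated with the columns of `cars` in scope.
parsed <- parse(text = condition)[[1]]
mask <- eval(parsed, envir = cars)
is.logical(mask)
head(cars[mask, ])
```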

The code should evaluate to:

  ds1<- ds %>%
     filter(dist > 50)
  ds1

resulting in:

ds1

   speed dist
1     14   60
2     14   80
3     15   54
4     18   56
5     18   76
6     18   84
7     19   68
8     20   52
9     20   56
10    20   64
11    22   66
12    23   54
13    24   70
14    24   92
15    24   93
16    24  120
17    25   85

Question:

How do I pass a string as a parameter in a dplyr verb?

r string dplyr parameter-passing data-manipulation
3 answers
3 votes

Since these 2014 answers, two new approaches have become possible using rlang's quasiquotation.

Conventional hard-coded filter statement. For comparison, the statement dist > 50 is included directly in dplyr::filter().

library(magrittr)

# The filter statement is hard-coded inside the function.
cars_subset_0 <- function( ) {
  cars %>%
    dplyr::filter(dist > 50)
}
cars_subset_0()

Result:

   speed dist
1     14   60
2     14   80
3     15   54
4     18   56
...
17    25   85

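rlang approach with NSE (nonstandard evaluation). As described in the Programming with dplyr vignette, the statement dist > 50 is processed by rlang::enquo(), which "uses some dark magic to look at the argument, see what the user typed, and return that value as a quosure". Then rlang's !! unquotes the input "so that it's evaluated immediately in the surrounding context".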

# The filter statement is evaluated with NSE.
cars_subset_1 <- function( filter_statement ) {
  filter_statement_en <- rlang::enquo(filter_statement)
  # as_label() converts the quosure to a readable string for the message.
  message("filter statement: `", rlang::as_label(filter_statement_en), "`.")

  cars %>%
    dplyr::filter(!!filter_statement_en)
}
cars_subset_1(dist > 50)

Result:

filter statement: `dist > 50`.
   speed dist
1     14   60
2     14   80
3     15   54
4     18   56
...
17    25   85
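
For reference, newer rlang releases (0.4.0 and later, to the best of my knowledge) offer the embrace operator `{{ }}` as shorthand for the enquo() plus !! pair. A minimal sketch of the same function written that way; `cars_subset_1b` and `subset_1b` are hypothetical names introduced here, and the sketch assumes a dplyr version that supports embracing:

```r
library(dplyr)

# Hypothetical variant of cars_subset_1(): `{{ }}` captures the
# argument and unquotes it in a single step.
cars_subset_1b <- function(filter_statement) {
  cars %>%
    dplyr::filter({{ filter_statement }})
}

subset_1b <- cars_subset_1b(dist > 50)
nrow(subset_1b)
```

The explicit enquo() form remains useful when the captured expression also has to be inspected or printed, as in the message() call above.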

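rlang approach passing a string. The statement "dist > 50" is passed to the function as an explicit string, parsed into an expression by rlang::parse_expr(), and then unquoted by !!.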

# The filter statement is passed as a string.
cars_subset_2 <- function( filter_statement ) {
  filter_statement_expr <- rlang::parse_expr(filter_statement)
  # Print the original string; pasting the parsed call would garble it.
  message("filter statement: `", filter_statement, "`.")

  cars %>%
    dplyr::filter(!!filter_statement_expr)
}
cars_subset_2("dist > 50")

Result:

filter statement: `dist > 50`.
   speed dist
1     14   60
2     14   80
3     15   54
4     18   56
...
17    25   85
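
The same idea can be extended to several conditions held as strings. The sketch below is one possible way to do it, not part of the original answer: each string is parsed separately and the resulting list is spliced into filter() with !!!, which combines the conditions with a logical AND. `cars_subset_3` and `subset_3` are hypothetical names.

```r
library(dplyr)

# Hypothetical extension of cars_subset_2(): accepts a character
# vector of conditions instead of a single string.
cars_subset_3 <- function(filter_statements) {
  # Parse each string into its own expression.
  filter_exprs <- lapply(filter_statements, rlang::parse_expr)

  # !!! splices the list so each expression becomes one filter argument.
  cars %>%
    dplyr::filter(!!!filter_exprs)
}

subset_3 <- cars_subset_3(c("dist > 50", "speed < 20"))
subset_3
```

Because the strings are parsed and run as R code, this pattern is best kept to conditions that come from a trusted source.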

Things are simpler with dplyr::select(). An explicit string needs only !!.

# The select statement is passed a string.
cars_subset_2b <- function( select_statement ) {
  cars %>%
    dplyr::select(!!select_statement)
}
cars_subset_2b("dist")
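
When only the column name arrives as a string, rather than a whole expression, parsing can be avoided altogether. A hedged sketch, assuming dplyr 1.0.0 or later: all_of() handles string column names in select(), and the .data pronoun does the same inside filter(). The function `cars_subset_2c` and its parameters are hypothetical names introduced for illustration.

```r
library(dplyr)

# Hypothetical helper: column names are passed as strings,
# the threshold as a plain number.
cars_subset_2c <- function(select_columns, filter_column, threshold) {
  cars %>%
    # .data[[...]] looks up a column by its string name.
    dplyr::filter(.data[[filter_column]] > threshold) %>%
    # all_of() selects the columns named in a character vector.
    dplyr::select(dplyr::all_of(select_columns))
}

subset_2c <- cars_subset_2c("dist", "dist", 50)
head(subset_2c)
```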

14 votes

In the next version of dplyr, it will probably work like this:

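condition <- quote(dist > 50)

# cars, not mtcars, is the dataset that has a dist column.
# Note: filter_() is deprecated in current dplyr releases.
cars %>%
   filter_(condition)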
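
Since the underscore verbs such as filter_() were later deprecated (as of dplyr 0.7.0), a quoted expression can instead be unquoted inside the regular verb. A minimal sketch of that equivalent, assuming a current dplyr and the `cars` dataset from the question:

```r
library(dplyr)

# Store the condition as an unevaluated expression, not a string.
condition <- quote(dist > 50)

# !! unquotes the stored expression inside filter().
ds1 <- cars %>%
  filter(!!condition)
nrow(ds1)
```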

3 votes

While they are working on it, here is a workaround using if:

library(dplyr)
library(magrittr)

ds <- data.frame(attend = c(1:5,NA,7:9,NA,NA,12))

filter_na <- FALSE

filtertest <- function(x,filterTF = filter_na){
  if(filterTF) x else !(x)
}

ds %>%
  filter(attend %>% is.na %>% filtertest)

  attend
1      1
2      2
3      3
4      4
5      5
6      7
7      8
8      9
9     12

filter_na <- TRUE
ds %>%
  filter(attend %>% is.na %>% filtertest)

  attend
1     NA
2     NA
3     NA
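
The workaround above depends on a global variable for the flag. A variation that passes the flag as a function argument might look like the following sketch; `filter_by_na`, `keep_na`, `non_missing` and `only_missing` are hypothetical names, and the data frame is the same one used above.

```r
library(dplyr)

ds <- data.frame(attend = c(1:5, NA, 7:9, NA, NA, 12))

# Hypothetical helper: keep_na chooses which branch of the
# condition filter() evaluates.
filter_by_na <- function(data, keep_na) {
  data %>%
    filter(if (keep_na) is.na(attend) else !is.na(attend))
}

non_missing  <- filter_by_na(ds, keep_na = FALSE)
only_missing <- filter_by_na(ds, keep_na = TRUE)
```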