Referring to the current data.frame in a dplyr pipeline


How can I refer to the current data.frame inside a dplyr pipeline? For example, in

library(dplyr)

myresults = tribble(
  ~dataset_name, ~method_group, ~method, ~value,
  'iris',        'other',       'a',     1,
  'wine',        'other',       'b',     2,
  'iris',        'mine',        'c',     3,
  'wine',        'mine',        'd',     4
)

myresults %>%
  mutate(dataset_name='datasets aggregated') %>%
  bind_rows(XXX %>% filter(method=='c') %>% mutate(method_group = 'other'))

I want to row-bind the current data.frame with itself. What should I write in place of XXX?

Inside do(), the answer seems to be the dot (.). It is not very elegant and I would rather not use do, but I managed to get the desired result with

myresults %>%
  mutate(dataset_name='datasets aggregated') %>%
  do(bind_rows(data.frame(.), data.frame(.) %>% filter(method=='c') %>% mutate(method_group = 'other')))

But that is not very nice.

My R version is:

> R.version
               _                           
platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          4.4                         
year           2018                        
month          03                          
day            15                          
svn rev        74408                       
language       R                           
version.string R version 3.4.4 (2018-03-15)
nickname       Someone to Lean On 
r dplyr
2 answers

7 votes

Three options I see:

  1. Move the . inside filter, since it seems to know what to do there:

    myresults %>%
      mutate(dataset_name='datasets aggregated') %>%
      bind_rows(filter(., method=='c') %>% mutate(method_group = 'other'))
    # # A tibble: 5 x 4
    #   dataset_name        method_group method value
    #   <chr>               <chr>        <chr>  <dbl>
    # 1 datasets aggregated other        a          1
    # 2 datasets aggregated other        b          2
    # 3 datasets aggregated mine         c          3
    # 4 datasets aggregated mine         d          4
    # 5 datasets aggregated other        c          3
    
  2. Use a temporary variable and break the pipe in the middle:

    z <- myresults %>% mutate(dataset_name='datasets aggregated')
    bind_rows(z, z %>% filter(method=='c') %>% mutate(method_group = 'other'))
    # # A tibble: 5 x 4
    #   dataset_name        method_group method value
    #   <chr>               <chr>        <chr>  <dbl>
    # 1 datasets aggregated other        a          1
    # 2 datasets aggregated other        b          2
    # 3 datasets aggregated mine         c          3
    # 4 datasets aggregated mine         d          4
    # 5 datasets aggregated other        c          3
    
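  3. Similar to your do implementation. (You don't need data.frame(.), which is a bit redundant, but do apparently does not replace instances of . inside the nested pipe.)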

    myresults %>%
      mutate(dataset_name='datasets aggregated') %>%
      do({dat <- .; bind_rows(dat, dat %>% filter(method=='c') %>% mutate(method_group = 'other'))})
    # # A tibble: 5 x 4
    #   dataset_name        method_group method value
    #   <chr>               <chr>        <chr>  <dbl>
    # 1 datasets aggregated other        a          1
    # 2 datasets aggregated other        b          2
    # 3 datasets aggregated mine         c          3
    # 4 datasets aggregated mine         d          4
    # 5 datasets aggregated other        c          3
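
The behaviour behind option 1 can be sketched with a toy tibble. This is only a minimal illustration, assuming magrittr's placeholder rules as I understand them: when . appears only inside a nested call, the left-hand side is still inserted as the first argument, whereas a pipe whose left-hand side is a bare . builds a function rather than a data frame. The names df, res and f are made up for the example.

```r
library(dplyr)

# Toy data; `df`, `res` and `f` are illustrative names only.
df <- tibble(x = 1:3)

# `.` occurs only inside the nested filter() call, so the left-hand side is
# still supplied as the first argument: bind_rows(df, filter(df, x == 1)).
res <- df %>% bind_rows(filter(., x == 1))
nrow(res)  # 4: the three original rows plus the one filtered row

# A pipe that starts with a bare `.` creates a function (a functional
# sequence), not a data frame, so it cannot be row-bound directly.
f <- . %>% filter(x == 1)
is.function(f)  # TRUE
```

This would explain why writing filter(., ...) works as an argument to bind_rows, while starting the nested pipe with a bare . does not.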
    

0 votes

A more general solution than @r2evans's is to use an anonymous function:

myresults %>%
  mutate(dataset_name='datasets aggregated') %>%
  # NOTE: the trailing `()` is optional with the magrittr pipe (%>%) that dplyr
  # re-exports, but compulsory with the base R pipe (|>).
  (\(.) bind_rows(., filter(., method=='c') %>% mutate(method_group = 'other')))()
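
Since the \(.) lambda shorthand and the |> pipe both need R 4.1.0 or later, and the question reports R 3.4.4, the same idea can also be written as an ordinary named function. The following is only a sketch under that assumption: append_relabelled is a hypothetical helper name, not part of dplyr.

```r
library(dplyr)

# Data from the question.
myresults <- tribble(
  ~dataset_name, ~method_group, ~method, ~value,
  'iris',        'other',       'a',     1,
  'wine',        'other',       'b',     2,
  'iris',        'mine',        'c',     3,
  'wine',        'mine',        'd',     4
)

# Hypothetical helper: returns `df` with an extra copy of the method 'c'
# rows appended, relabelled as method_group 'other'.
append_relabelled <- function(df) {
  bind_rows(
    df,
    df %>% filter(method == 'c') %>% mutate(method_group = 'other')
  )
}

res <- myresults %>%
  mutate(dataset_name = 'datasets aggregated') %>%
  append_relabelled()
```

Inside the function the data frame has a proper name, so there is no need for the . placeholder at all.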