Nested rollmean - how to avoid NAs at the edges of each DF


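I am using rollmean() to get 4-year and 5-year averages of time series data. The averages must not be computed across groups ("Grade" in the df), so I am nesting the data with nest() (which comes from tidyr, not purrr) and mapping over the nested data frames with purrr::map().

I know I could replace the missing values with 0, but I would like to know what other approaches exist that do not involve zero padding.

The end result I am looking for is a rolling mean per grouping variable with as few NAs as possible. I feel my approach fails on that last point.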

Data:

library(tidyverse)  # dplyr, tidyr, purrr, tibble
library(zoo)        # rollmean(), rollapply()

csr_ <- tribble(
  ~Year_, ~Grade,  ~AttndRise,
  2016,   "K5",    1.0000000,
  2017,   "K5",    1.0000000,
  2018,   "K5",    0.7562500,
  2016,   "Gr. 1", 0.9448276,
  2017,   "Gr. 1", 1.0000000,
  2018,   "Gr. 1", 0.7625000,
  2016,   "Gr. 2", 1.0000000,
  2017,   "Gr. 2", 1.0000000,
  2018,   "Gr. 2", 0.8709677,
  2016,   "Gr. 3", 1.1240876,
  2017,   "Gr. 3", 1.0000000,
  2018,   "Gr. 3", 0.8467153,
  2016,   "Gr. 4", 0.7857143,
  2017,   "Gr. 4", 1.0000000,
  2018,   "Gr. 4", 0.9635036,
  2016,   "Gr. 5", 0.7685950,
  2017,   "Gr. 5", 1.0000000,
  2018,   "Gr. 5", 0.9480519,
  2016,   "Gr. 6", 0.9462366,
  2017,   "Gr. 6", 1.0000000,
  2018,   "Gr. 6", 1.0247934
)

Processing:

csr_ %>%
  group_by(Grade) %>%
  nest() %>%
  # Compute the rolling mean inside each nested data frame (one per Grade)
  mutate(data = map(data, ~ .x %>%
                      mutate(four_year = rollmean(x = AttndRise, k = 3, align = "center", fill = NA)))) %>%
  unnest(cols = c(data))

##result

# A tibble: 21 x 4
   Grade Year_ AttndRise four_year
   <chr> <dbl>     <dbl>     <dbl>
  K5     2016     1        NA    
  K5     2017     1         0.919
  K5     2018     0.756    NA    
  Gr. 1  2016     0.945    NA    
  Gr. 1  2017     1         0.902
  Gr. 1  2018     0.762    NA    
  Gr. 2  2016     1        NA    
  Gr. 2  2017     1         0.957
  Gr. 2  2018     0.871    NA    
 Gr. 3  2016     1.12     NA    
# … with 11 more rows


# Note: for this smaller data set I have reduced k to 3 instead of 4 and 5.

With the smaller data set the effect is more exaggerated: each "edge" of a nested DF (the first and last year of each grade) is assigned an NA value.
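The edge NAs can be reproduced on a single grade, outside any nesting. The following is a minimal sketch that uses only the K5 values from the sample data; the variable names `x`, `centered` and `trailing` are illustrative and not part of the original code. A window of width k needs k complete observations, so with `fill = NA` every position whose window runs past either end of the series is padded with NA.

```r
library(zoo)

# AttndRise for grade K5 in the sample data (2016, 2017, 2018)
x <- c(1.0000000, 1.0000000, 0.7562500)

# Centered window of width 3: only the middle year has a complete window,
# so the first and last positions are filled with NA.
centered <- rollmean(x, k = 3, align = "center", fill = NA)

# Right-aligned window of width 3: only the last year has a complete window.
trailing <- rollmean(x, k = 3, align = "right", fill = NA)

print(centered)
print(trailing)
```

Changing the alignment only moves the NAs around; it does not reduce their number, which is why a different treatment of incomplete windows is needed.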


Thanks!

r dplyr purrr zoo
1 Answer
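# `csr` here is the full data set (with a `Year` column), as the result columns below show.
csr %>%
  group_by(Grade) %>%
  arrange(Grade, desc(Year)) %>%
  nest() %>%
  # partial = TRUE averages whatever values fall inside the window,
  # so the edges of each grade get a mean instead of NA
  mutate(data = map(data, ~ .x %>%
                      mutate(four_year = rollapply(AttndRise, 4, mean, partial = TRUE),
                             five_year = rollapply(AttndRise, 5, mean, partial = TRUE))
                    )
         ) %>%
  unnest(cols = c(data))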

Result:

# A tibble: 378 x 9
   Grade FakeCrudeBirthRate FakeFertilityRate Year    AttndRise Year_ Cohort     four_year five_year
   <fct>              <dbl>             <dbl> <chr>       <dbl> <dbl> <chr>          <dbl>     <dbl>
 1 K5                  11.9              2.28 2018-19     0.756  2018 Elementary     0.919     0.919
 2 K5                  11.9              2.28 2017-18     1      2017 Elementary     0.939     0.939
 3 K5                  11.9              2.28 2016-17     1      2016 Elementary     1         0.951
 4 K5                  11.9              2.28 2015-16     1      2015 Elementary     1         1    
 5 K5                  11.9              2.28 2014-15     1      2014 Elementary     1         1    
 6 K5                  11.9              2.28 2013-14     1      2013 Elementary     1         1    
 7 K5                  11.9              2.28 2012-13     1      2012 Elementary     0.992     0.994
 8 K5                  11.9              2.28 2011-12     1      2011 Elementary     1.01      1.01 
 9 K5                  11.9              2.28 2010-11     0.969  2010 Elementary     1.05      1.04 
10 K5                  11.9              2.28 2009-10     1.07   2009 Elementary     1.05      1.04 
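For readers who only have the small sample from the question, here is a hedged sketch of the same `partial = TRUE` idea applied to two of its grades. It is a variant, not the code above: it uses a grouped `mutate()` instead of `nest()`/`map()`, a window width of 3, and ascending year order. The names `csr_small`, `rolled` and `roll3` are made up for this illustration.

```r
library(dplyr)
library(zoo)

# Two grades taken from the question's sample data
csr_small <- tibble::tribble(
  ~Year_, ~Grade,  ~AttndRise,
  2016,   "K5",    1.0000000,
  2017,   "K5",    1.0000000,
  2018,   "K5",    0.7562500,
  2016,   "Gr. 1", 0.9448276,
  2017,   "Gr. 1", 1.0000000,
  2018,   "Gr. 1", 0.7625000
)

rolled <- csr_small %>%
  group_by(Grade) %>%
  arrange(Year_, .by_group = TRUE) %>%
  # mutate() on a grouped data frame runs rollapply() separately per Grade,
  # so no window ever spans two grades. With partial = TRUE the window
  # shrinks at the edges instead of producing NA.
  mutate(roll3 = rollapply(AttndRise, 3, mean, partial = TRUE)) %>%
  ungroup()

print(rolled)
```

Note that the edge values are means over fewer observations than the interior ones (two years instead of three here), so they are not strictly comparable with full-window means. Whether that trade-off is acceptable depends on the analysis.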