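I am using rollmean() to get 4-year and 5-year averages of time series data. The averages must not be computed across groups ("Grade" in the df), so I am using tidyr::nest() together with purrr::map().
I know I could replace the missing values with 0, but I would like to know what other approaches exist that avoid zero filling.
The end result I am looking for is a rolling mean within each level of the grouping variable, with as few NAs as possible; I feel my approach fails on that last point.
Data: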
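library(tidyverse)  # tribble(), %>%, group_by(), nest(), map(), mutate(), unnest()
library(zoo)        # rollmean(), rollapply()

csr_ <- tribble(~Year_, ~Grade, ~AttndRise,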
2016,"K5", 1.0000000,
2017,"K5", 1.0000000,
2018,"K5", 0.7562500,
2016, "Gr. 1", 0.9448276,
2017, "Gr. 1", 1.0000000,
2018, "Gr. 1", 0.7625000,
2016, "Gr. 2", 1.0000000,
2017, "Gr. 2", 1.0000000,
2018, "Gr. 2", 0.8709677,
2016, "Gr. 3", 1.1240876,
2017, "Gr. 3", 1.0000000,
2018, "Gr. 3", 0.8467153,
2016, "Gr. 4", 0.7857143,
2017, "Gr. 4", 1.0000000,
2018, "Gr. 4", 0.9635036,
2016, "Gr. 5", 0.7685950,
2017, "Gr. 5", 1.0000000,
2018, "Gr. 5", 0.9480519,
2016, "Gr. 6", 0.9462366,
2017, "Gr. 6", 1.0000000,
2018, "Gr. 6", 1.0247934)
Processing:
csr_ %>%
  group_by(Grade) %>%
  nest() %>%
  mutate(data = map(data, ~ .x %>%
    mutate(four_year = rollmean(x = AttndRise, k = 3, align = "center", fill = NA)))) %>%
  unnest()
##result
# A tibble: 21 x 4
Grade Year_ AttndRise four_year
<chr> <dbl> <dbl> <dbl>
K5 2016 1 NA
K5 2017 1 0.919
K5 2018 0.756 NA
Gr. 1 2016 0.945 NA
Gr. 1 2017 1 0.902
Gr. 1 2018 0.762 NA
Gr. 2 2016 1 NA
Gr. 2 2017 1 0.957
Gr. 2 2018 0.871 NA
Gr. 3 2016 1.12 NA
# … with 11 more rows
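# Note: for this smaller data set I have reduced k to 3 instead of 4 and 5.
With the smaller data set the effect is more exaggerated: we can see how each "edge" of the nested data frames (the first and last year of each grade) is given an NA value.
Thanks!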
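To make the edge behaviour concrete, here is a minimal sketch on a plain vector, using the three K5 values from the sample data. It compares rollmean() with fill = NA against rollapply() with partial = TRUE; the variable names full_window and partial_window are only illustrative.

```r
library(zoo)

# One grade's values from the sample data (K5, 2016 to 2018)
x <- c(1.0000000, 1.0000000, 0.7562500)

# Fixed-width window: positions without a complete window receive the fill value
full_window <- rollmean(x, k = 3, align = "center", fill = NA)

# Partial window: edge positions average whatever values fall inside the window
partial_window <- rollapply(x, width = 3, FUN = mean,
                            align = "center", partial = TRUE)

print(full_window)
print(partial_window)
```

With partial = TRUE the first and last positions are means of two values rather than three, so they are based on less data than the interior positions. Whether that is acceptable depends on how the averages will be used.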
csr %>%
  group_by(Grade) %>%
  arrange(Grade, desc(Year)) %>%
  nest() %>%
  mutate(data = map(data, ~ .x %>%
    mutate(four_year = rollapply(AttndRise, 4, mean, partial = TRUE),
           five_year = rollapply(AttndRise, 5, mean, partial = TRUE))
    )
  ) %>%
  unnest()
Result:
# A tibble: 378 x 9
Grade FakeCrudeBirthRate FakeFertilityRate Year AttndRise Year_ Cohort four_year five_year
<fct> <dbl> <dbl> <chr> <dbl> <dbl> <chr> <dbl> <dbl>
1 K5 11.9 2.28 2018-19 0.756 2018 Elementary 0.919 0.919
2 K5 11.9 2.28 2017-18 1 2017 Elementary 0.939 0.939
3 K5 11.9 2.28 2016-17 1 2016 Elementary 1 0.951
4 K5 11.9 2.28 2015-16 1 2015 Elementary 1 1
5 K5 11.9 2.28 2014-15 1 2014 Elementary 1 1
6 K5 11.9 2.28 2013-14 1 2013 Elementary 1 1
7 K5 11.9 2.28 2012-13 1 2012 Elementary 0.992 0.994
8 K5 11.9 2.28 2011-12 1 2011 Elementary 1.01 1.01
9 K5 11.9 2.28 2010-11 0.969 2010 Elementary 1.05 1.04
10 K5 11.9 2.28 2009-10 1.07 2009 Elementary 1.05 1.04
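
As a possible simplification, the nest()/map()/unnest() round trip may not be needed, because mutate() on a grouped data frame already evaluates its expressions one group at a time. The sketch below applies that idea to two grades from the sample data, with the window reduced to 3 as in the question; csr_small and three_year are hypothetical names chosen for illustration.

```r
library(dplyr)
library(tibble)
library(zoo)

# Two grades from the sample data, enough to show the group boundary
csr_small <- tribble(
  ~Year_, ~Grade,  ~AttndRise,
  2016,   "K5",    1.0000000,
  2017,   "K5",    1.0000000,
  2018,   "K5",    0.7562500,
  2016,   "Gr. 1", 0.9448276,
  2017,   "Gr. 1", 1.0000000,
  2018,   "Gr. 1", 0.7625000
)

rolled <- csr_small %>%
  group_by(Grade) %>%
  arrange(Year_, .by_group = TRUE) %>%   # order years within each grade
  mutate(three_year = rollapply(AttndRise, width = 3, FUN = mean,
                                align = "center", partial = TRUE)) %>%
  ungroup()

print(rolled)
```

Because the window is applied inside each group, no value from one grade contributes to the mean of another, and partial = TRUE leaves no NA at the first or last year of a grade.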