Update values to the most recent non-missing value [duplicate]


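My data has multiple observations per ID. Within each ID, I want to set every value to the most recent non-missing value. I tried using mutate, group_by(id) and which.max(year), but without success.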

Data:

data <- data.frame(
  id=c(1,1,2,2,3,3,4,4,5,5),
  year=rep(c(2010, 2011), 5),
  employ=c("yes", "yes", "no", "yes", "yes", "no", NA, "yes", "no", NA))

> data
   id year employ
1   1 2010    yes
2   1 2011    yes
3   2 2010     no
4   2 2011    yes
5   3 2010    yes
6   3 2011     no
7   4 2010   <NA>
8   4 2011    yes
9   5 2010     no
10  5 2011   <NA>

Desired output:

data2 <- data.frame(
  id=c(1,1,2,2,3,3,4,4,5,5),
  year=c(2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2010, 2010),
  employ=c("yes", "yes", "yes", "yes", "no", "no","yes", "yes","no", "no"))

> data2
   id year employ
1   1 2011    yes
2   1 2011    yes
3   2 2011    yes
4   2 2011    yes
5   3 2011     no
6   3 2011     no
7   4 2011    yes
8   4 2011    yes
9   5 2010     no
10  5 2010     no
r dplyr data.table
1 Answer

A data.table option:

library(data.table)
setDT(data)[, employ := last(na.omit(employ[order(year)])), by = id]

which gives

    id year employ
 1:  1 2010    yes
 2:  1 2011    yes
 3:  2 2010    yes
 4:  2 2011    yes
 5:  3 2010     no
 6:  3 2011     no
 7:  4 2010    yes
 8:  4 2011    yes
 9:  5 2010     no
10:  5 2011     no
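Note that the desired output in the question also changes year (id 5 ends up with 2010), while the one-liner above only fills employ. A minimal sketch of one way to update both columns with data.table, assuming "most recent" means the largest year whose employ is not NA; the names dt and i are illustrative and not part of the original answer:

```r
library(data.table)

dt <- data.table(
  id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  year = rep(c(2010, 2011), 5),
  employ = c("yes", "yes", "no", "yes", "yes", "no", NA, "yes", "no", NA)
)

dt[, c("year", "employ") := {
  # Position (within the group) of the latest year that has a non-missing employ;
  # rows with a missing employ are pushed to -Inf so they are never picked first
  i <- which.max(ifelse(is.na(employ), -Inf, year))
  # Length-1 values are recycled to every row of the group
  list(year[i], employ[i])
}, by = id]

print(dt)
```

Picking a single row index and reusing it for both columns keeps year and employ consistent with each other.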

A dplyr approach could be:

library(dplyr)

data %>%
  group_by(id) %>%
  mutate(employ = last(na.omit(employ[order(year)])))

which gives

      id  year employ
   <dbl> <dbl> <chr>
 1     1  2010 yes
 2     1  2011 yes
 3     2  2010 yes
 4     2  2011 yes
 5     3  2010 no
 6     3  2011 no
 7     4  2010 yes
 8     4  2011 yes
 9     5  2010 no
10     5  2011 no
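If year should also match the row the value came from, as in the question's data2, the dplyr pipeline can carry a row position along. This is a sketch under the assumption that the latest year with a non-missing employ is the one wanted; the helper column latest and the name result are hypothetical:

```r
library(dplyr)

data <- data.frame(
  id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  year = rep(c(2010, 2011), 5),
  employ = c("yes", "yes", "no", "yes", "yes", "no", NA, "yes", "no", NA)
)

result <- data %>%
  group_by(id) %>%
  mutate(
    # Within-group position of the latest year whose employ is not NA
    latest = which.max(ifelse(is.na(employ), -Inf, year)),
    employ = employ[latest],
    year = year[latest]
  ) %>%
  ungroup() %>%
  select(-latest)  # drop the helper column

print(result)
```

For the sample data this reproduces the year and employ columns of data2, including id 5, where the 2011 observation is missing and the 2010 row is used instead.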