How to propagate the value for a specific year to every row of a group?

Question

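I have a large dataset with yearly values for different countries. I want to create an additional column that contains each country's value from its most recent year.

I tried filtering the dataset for the most recent years with an ifelse loop (almost everything has either 2023 or 2022) and then merging the result back into the main dataset, but that gave me a lot of duplicates and I don't understand why. Any ideas on how to do this without relying on the filter/merge approach?

Using this mock data: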

countries <- c('A','A','A','B','B','C')

years <- c('2000','2001','2002','1999','2001','2000')

values <- c(1,3,1,2,3,2)

df <- data.frame(countries,years,values)

ID	Year	Value
A	2000	1
A	2001	3
A	2002	1
B	1999	2
B	2001	3
C	2000	2

I would like to get the following result, with the latest value appended to df:

ID	Year	Value	Latest_Value
A	2000	1	1
A	2001	3	1
A	2002	1	1
B	1999	2	3
B	2001	3	3
C	2000	2	2

Answer 1

Use last after grouping by countries. Sort by years first so that the last row in each group is the most recent one.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

countries <- c('A','A','A','B','B','C')
years <- c('2000','2001','2002','1999','2001','2000')
values <- c(1,3,1,2,3,2)
df <- data.frame(countries,years,values)

df %>%
  arrange(countries, years) %>%
  mutate(Latest_Value = last(values), .by = countries)
#>   countries years values Latest_Value
#> 1         A  2000      1            1
#> 2         A  2001      3            1
#> 3         A  2002      1            1
#> 4         B  1999      2            3
#> 5         B  2001      3            3
#> 6         C  2000      2            2

Created on 2024-10-10 with reprex v2.1.1

Answer 2

You can try

> library(dplyr)

> df %>% mutate(latest_Value = values[which.max(years)], .by = countries)
  countries years values latest_Value
1         A  2000      1            1
2         A  2001      3            1
3         A  2002      1            1
4         B  1999      2            3
5         B  2001      3            3
6         C  2000      2            2
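
In the mock data, years is a character column, so the call above appears to rely on the year strings being converted to numbers implicitly. A minimal variant, assuming every year string is a plain integer, makes that conversion explicit. The name res is only illustrative, and .by needs dplyr 1.1.0 or later.

```r
library(dplyr)

# Mock data from the question (years stored as character strings)
df <- data.frame(
  countries = c('A', 'A', 'A', 'B', 'B', 'C'),
  years     = c('2000', '2001', '2002', '1999', '2001', '2000'),
  values    = c(1, 3, 1, 2, 3, 2)
)

# Assumption: all year strings parse cleanly as integers.
# Within each country, pick the value from the row with the largest year.
res <- df %>%
  mutate(latest_Value = values[which.max(as.integer(years))], .by = countries)

print(res)
```

Because which.max returns only the first maximum, a country with two rows for the same latest year would get the value from whichever of those rows comes first.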

Answer 3

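Split the data.frame by country, use which.max to get the latest value for each country, then assign those values to a new column in the original data.frame by using match to look up each row's country in the new vector: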

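# Latest value per country: the value in the row with the largest year
latests <- sapply(split(df, df$countries), \(x) x$values[which.max(x$years)])
# Look up each row's country in the named vector of latest values
df$Latest_Value <- latests[match(df$countries, names(latests))]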

> df
  countries years values Latest_Value
1         A  2000      1            1
2         A  2001      3            1
3         A  2002      1            1
4         B  1999      2            3
5         B  2001      3            3
6         C  2000      2            2
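
On the duplicates mentioned in the question: the merge that produced them is not shown, so the cause can only be guessed. One common cause is a lookup table with more than one row per country (for example, a country that still has both a 2022 and a 2023 row after filtering), because merge returns every matching pair of rows. The base R sketch below assumes that was the problem and keeps exactly one row per country before merging. The names latest and merged are illustrative.

```r
# Mock data from the question, with years converted to integer
df <- data.frame(
  countries = c('A', 'A', 'A', 'B', 'B', 'C'),
  years     = c('2000', '2001', '2002', '1999', '2001', '2000'),
  values    = c(1, 3, 1, 2, 3, 2)
)
df$years <- as.integer(df$years)

# Keep only the row holding each country's maximum year
is_latest <- df$years == ave(df$years, df$countries, FUN = max)
latest <- df[is_latest, c("countries", "values")]
names(latest)[2] <- "Latest_Value"

# Guard: the lookup table must have one row per country,
# otherwise the merge below would multiply rows
stopifnot(anyDuplicated(latest$countries) == 0)

# Merge on the country key only; the row count stays the same
merged <- merge(df, latest, by = "countries")
print(merged)
```

If the guard fails on real data, the lookup table has repeated countries, and that would explain extra rows after the merge.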
