我有两个数据框:
> df1 <- data.frame(date = as.Date( c( "2021-06-01", "2021-06-02", "2021-06-03", "2021-06-04",
"2021-06-05", "2021-06-06", "2021-06-07", "2021-06-08",
"2021-06-09", "2021-06-10", "2021-06-11", "2021-06-12",
"2021-06-13") ),
temperature = c( 17, 30, 28, 29, 16, 21, 20, 11, 28, 29, 25, 26, 19) )
和
> df2 <- data.frame( ID = c( 1 : 4 ),
date.pose = as.Date(c("2021-06-01", "2021-06-03", "2021-06-06", "2021-06-10") ),
date.withdrawal = as.Date(c("2021-06-02", "2021-06-05", "2021-06-09", "2021-06-13") ) )
我想将
df2
中每个时期的平均温度存储在新列 (df2$mean.temperature
) 中。
对于来自
ID = 1
的df2
,平均温度将根据2021-06-01
和2021-06-02
的温度计算,即mean(17, 30)
换句话说,我想得到这个:
> df2 <- data.frame(ID = c( 1 : 4 ),
date.pose = as.Date( c("2021-06-01", "2021-06-03", "2021-06-06", "2021-06-10") ) ,
date.withdrawal = as.Date( c("2021-06-03", "2021-06-06", "2021-06-10", "2021-06-13") ),
mean.Temperature = c(23.5, 24.3, 20.0, 24.8) )
我正在尝试将
df2
中的 ID 添加到 df1
中的新列中。一旦我这样做了,我就可以像这样聚合:
> df3 <- aggregate(df1$temperature, list(df1$ID, df2$date.pose), FUN = mean)
不知道如何在df1中添加对应的ID。 或者也许有更好的方法来做到这一点?
这是一种使用
uncount
中的 tidyr
和一些连接的方法。
df2 %>%
mutate(days = (date.witdrawal - date.pose + 1) %>% as.integer) %>%
tidyr::uncount(days, .id = "row") %>%
transmute(ID, date = date.pose + row - 1) %>%
left_join(df1) %>%
group_by(ID) %>%
summarize(mean.Temperature = mean(temperature)) %>%
right_join(df2)
结果
# A tibble: 4 × 4
ID mean.Temperature date.pose date.witdrawal
<int> <dbl> <date> <date>
1 1 23.5 2021-06-01 2021-06-02
2 2 24.3 2021-06-03 2021-06-05
3 3 20 2021-06-06 2021-06-09
4 4 24.8 2021-06-10 2021-06-13
更新。感谢@Jon Spring:
我们可以这样做:
逻辑:
在长时间旋转 df1 后按日期加入两个 df
arrange
按日期和 fill
然后按 ID 分组后使用 summarise
和 mean()
最后重新加入:
library(dplyr)
library(tidyr)
df2 %>%
pivot_longer(-ID, values_to = "date") %>%
full_join(df1, by= "date") %>%
arrange(date) %>%
fill(ID, .direction = "down") %>%
group_by(ID) %>%
summarise(mean_temp = mean(temperature, na.rm = TRUE)) %>%
left_join(df2, by="ID")
ID mean_temp date.pose date.witdrawal
<int> <dbl> <date> <date>
1 1 23.5 2021-06-01 2021-06-02
2 2 24.3 2021-06-03 2021-06-05
3 3 20 2021-06-06 2021-06-09
4 4 24.8 2021-06-10 2021-06-13