My data looks like this:
library(tibble)

df <- tibble(
  date = seq.Date(as.Date("2021-01-01"), as.Date("2022-02-01"), by = "month"),
  val1 = c(105, 105, 105, 125, 125, 125, 125, 132, 132, 132, 135, 150, 150, 150),
  val2 = c(100, 100, 100, 125, 125, 125, 125, 125, 125, 125, 125, 150, 150, 150),
  diff = val1 - val2
)
I am trying to produce the following:
output <- tibble(
  date = seq.Date(as.Date("2021-01-01"), as.Date("2022-02-01"), by = "month"),
  val1 = c(105, 105, 105, 125, 125, 125, 125, 132, 132, 132, 135, 150, 150, 150),
  val2 = c(100, 100, 100, 125, 125, 125, 125, 125, 125, 125, 125, 150, 150, 150),
  diff = val1 - val2,
  diff_calc = c(0, 0, 0, 5, 5, 5, 5, 5, 5, 5, 5, 22, 22, 22)
)
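where diff_calc is the cumulative sum of the previous unique values of diff. The sum is taken at the point where diff becomes 0, using the diff values that came before it, and that total is repeated until diff is 0 again, at which point the unique diff values seen since then are added to the running total. In the example, diff_calc becomes 5 at the first run of zeros and 5 + 7 + 10 = 22 at the second.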
I have tried different combinations of lags and joins, but I am really struggling here. Thanks!
This seems to work, but I would definitely welcome a better solution.
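library(timetk)
library(tidyverse)
library(lubridate)  # interval() and months(); not attached by tidyverse before 2.0.0

df1 <- df %>%
  filter(diff != 0) %>%
  group_by(diff) %>%
  top_n(1, date) %>%  # keep the last date for each distinct diff value
  ungroup() %>%
  mutate(
    diff_sum = cumsum(diff),
    date1 = lead(date),
    date_diff = interval(date, date1) %/% months(1),
    pay_date = date %+time% "1 month"
  ) %>%
  # drop rows that are followed immediately by another non-zero diff
  filter(date_diff != 1 | is.na(date_diff)) %>%
  select(pay_date, diff_sum) %>%
  pad_by_time(
    .date_var = pay_date, .by = "month",
    .start_date = "2021-01-01", .end_date = "2022-02-01"
  ) %>%
  fill(diff_sum, .direction = "down")

# months before the first pay_date stay NA here, not 0
df2 <- df %>%
  left_join(df1, by = c("date" = "pay_date"))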
But I am definitely open to other approaches.
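One possible alternative, offered only as a sketch: the rule described in the question can be expressed with run-length encoding in base R, without joins or date padding. The helper name diff_calc_from_runs is hypothetical and not from the original post. It assumes that "unique values" means consecutive runs of equal diff values, so a value that reappears in a later, separate run would be counted again, which differs from the group_by(diff) approach above.

```r
# Hypothetical helper (not from the original post): carries forward the running
# total of run values, updated only when a run of zeros begins.
diff_calc_from_runs <- function(diff) {
  runs <- rle(diff)                       # consecutive runs of equal diff values
  run_total <- cumsum(runs$values)        # running total, one entry per run
  is_zero_run <- runs$values == 0
  zero_runs_seen <- cumsum(is_zero_run)   # 0 before the first run of zeros
  # Totals recorded at each zero run, with a leading 0 for the rows before any
  totals <- c(0, run_total[is_zero_run])
  rep(totals[zero_runs_seen + 1], times = runs$lengths)
}

val1 <- c(105, 105, 105, 125, 125, 125, 125, 132, 132, 132, 135, 150, 150, 150)
val2 <- c(100, 100, 100, 125, 125, 125, 125, 125, 125, 125, 125, 150, 150, 150)

diff_calc_from_runs(val1 - val2)
# 0 0 0 5 5 5 5 5 5 5 5 22 22 22
```

With the tibble from the question, this could be applied as df %>% mutate(diff_calc = diff_calc_from_runs(diff)). Unlike the join-based version, the rows before the first run of zeros come out as 0 rather than NA.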