Here are two tables:
Table1
Date OldPrice NewPrice
2014-06-12 09:32:56 0 10
2014-06-27 16:13:36 10 12
2014-08-12 22:41:47 12 13
Table2
Date Qty
2014-06-15 18:09:23 5
2014-06-19 12:04:29 4
2014-06-22 13:21:34 3
2014-06-29 19:01:22 6
2014-07-01 18:02:33 3
2014-09-29 22:41:47 6
I would like to display the result this way:
Date OldPrice NewPrice Qty
2014-06-12 09:32:56 0 10 0
2014-06-27 16:13:36 10 12 12
2014-08-12 22:41:47 12 13 15
I used this command:
for(i in 1:nrow(Table1)){
  startDate = Table1$Date[i]
  endDate = Table1$Date[i+1]
  code=aggregate(list(Table2$Qty),
    by=list(Table1$Date, Table1$OldPrice, Table1$NewPrice, Date = Table2$Date > startDate & Table2$Date <= endDate), FUN=sum)
}
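I want the quantities aggregated between the given dates in the first table, i.e. between the first and second date, between the second and third date, and so on. In addition, the last date needs to aggregate everything up to the last date in the other table.
Thanks in advance!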
Compared with your previous one, I understand what you mean in this question.
This gives you the exact output you provided:
# Date OldPrice NewPrice Quantity
#1 2014-06-12 09:32:56 0 10 0
#2 2014-06-27 16:13:36 10 12 12
#3 2014-08-12 22:41:47 12 13 15
It is produced by the following code (see the comments for explanations):
# your data and preparation
df1 <- read.table(text=
"'Date' 'OldPrice' 'NewPrice'
'2014-06-12 09:32:56' '0' '10'
'2014-06-27 16:13:36' '10' '12'
'2014-08-12 22:41:47' '12' '13'", stringsAsFactors=FALSE, header=TRUE)
df2 <- read.table(text=
"'Date' 'Qty'
'2014-06-15 18:09:23' '5'
'2014-06-19 12:04:29' '4'
'2014-06-22 13:21:34' '3'
'2014-06-29 19:01:22' '6'
'2014-07-01 18:02:33' '3'
'2014-09-29 22:41:47' '6'", stringsAsFactors=FALSE, header=TRUE)
# convert the Date columns to date-time (POSIXct) values
df1$Date <- as.POSIXct(df1$Date)
df2$Date <- as.POSIXct(df2$Date)
df1 <- df1[order(df1$Date), ] # order df1 by Date
# one list slot per interval between consecutive df1 dates,
# plus two slots for df2 dates that fall outside the df1 date range
values <- vector("list", length = nrow(df1) + 1)
out_of_time_dates_before <- c(); out_of_time_dates_after <- c() # quantities dated before or after every date in df1
names(values) <- c(1:(length(values) - 2), "out_of_time_dates_before", "out_of_time_dates_after")
# producing the main outputs
for(j in seq_len(nrow(df2))){
  print(paste0("Being processed: ", df2$Date[j]))
  for(i in seq_len(nrow(df1) - 1)){
    # interval i is (df1$Date[i], df1$Date[i+1]]; the upper bound is inclusive,
    # as in the question, so a df2 date equal to a df1 date is not dropped
    if(df2$Date[j] > df1$Date[i] && df2$Date[j] <= df1$Date[i+1]){
      values[[i]] <- append(values[[i]], df2$Qty[j])
    }
  }
  if(df2$Date[j] <= min(df1$Date)){
    # on or before the first df1 date
    out_of_time_dates_before <- append(out_of_time_dates_before, df2$Qty[j])
    values[["out_of_time_dates_before"]] <- append(values[["out_of_time_dates_before"]], df2$Qty[j])
  } else if(df2$Date[j] > max(df1$Date)){
    # after the last df1 date
    out_of_time_dates_after <- append(out_of_time_dates_after, df2$Qty[j])
    values[["out_of_time_dates_after"]] <- append(values[["out_of_time_dates_after"]], df2$Qty[j])
  }
}
# aggregating the quantity for the date ranges, plus everything that falls before or after the dates available in df1
totals <- sapply(values, sum) # sum(NULL) is 0, so empty slots contribute 0
df1$Quantity <- c(0, unname(totals[1:(nrow(df1) - 1)])) # the leading quantity is 0 (as per your example)
df1$Quantity[1] <- df1$Quantity[1] + totals[["out_of_time_dates_before"]]
df1$Quantity[nrow(df1)] <- df1$Quantity[nrow(df1)] + totals[["out_of_time_dates_after"]]
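I think you have an interesting problem to solve, but a somewhat unfortunate way of conveying it to the SO community. For example, it is unclear what to do with dates in df2 that come before any range available in df1, so in the code above I add those quantities (if any) to the first date. This is more general, and mirrors what you expect for dates in df2 that fall after the date range in df1 (they are added to the last date in df1).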
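If the nested loop gets slow on larger tables, the same bucketing can probably be done without loops. The following is only a minimal sketch under a few assumptions of mine: the data frames are rebuilt inline with `tz = "UTC"` so the example is self-contained, the helper name `row` and the output column name `Qty` are my own choices, and a df2 date that exactly equals a df1 date is counted in the interval ending at that date (the `> startDate & <= endDate` rule from the question). It relies on base R's `findInterval()` with `left.open = TRUE`, which needs R 3.3.0 or later.

```r
# Rebuild the two tables from the question (same values as above).
df1 <- data.frame(
  Date = as.POSIXct(c("2014-06-12 09:32:56", "2014-06-27 16:13:36",
                      "2014-08-12 22:41:47"), tz = "UTC"),
  OldPrice = c(0, 10, 12),
  NewPrice = c(10, 12, 13)
)
df2 <- data.frame(
  Date = as.POSIXct(c("2014-06-15 18:09:23", "2014-06-19 12:04:29",
                      "2014-06-22 13:21:34", "2014-06-29 19:01:22",
                      "2014-07-01 18:02:33", "2014-09-29 22:41:47"), tz = "UTC"),
  Qty = c(5, 4, 3, 6, 3, 6)
)

df1 <- df1[order(df1$Date), ]  # findInterval() needs sorted break points
n <- nrow(df1)

# For each df2 date, idx is the number of df1 dates strictly before it:
# 0 = on or before the first df1 date, n = after the last df1 date.
idx <- findInterval(as.numeric(df2$Date), as.numeric(df1$Date),
                    left.open = TRUE)

# Interval i (between df1 dates i and i+1) is reported on row i+1;
# anything before the first date lands on row 1, anything after the last on row n.
row <- pmin(idx + 1, n)

# Sum the quantities per target row; rows without any match get 0.
df1$Qty <- vapply(seq_len(n), function(i) sum(df2$Qty[row == i]), numeric(1))

print(df1)
```

On the sample data this should reproduce the Qty column from your expected output (0, 12, 15). The design choice here is to turn the date ranges into an integer index first, so the aggregation itself becomes a plain grouped sum.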