R - Find the dates on which a cumulative sum exceeds a threshold


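I am studying the thermal requirements for crop growth. I have a table containing daily temperatures and their cumulative sum over a 6-month period. An example: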

         date      temp   cum_temp
 1: 2020-03-01  9.339748   9.339748
 2: 2020-03-02 23.860849  33.200597
 3: 2020-03-03 12.860331  46.060928
 4: 2020-03-04 26.607505  72.668432
 5: 2020-03-05 28.273551 100.941984
 6: 2020-03-06  2.321138 103.263122
 7: 2020-03-07 16.315059 119.578181
 8: 2020-03-08 26.880152 146.458334
 9: 2020-03-09 16.991615 163.449949
10: 2020-03-10 14.241827 177.691776
11: 2020-03-11 28.748167 206.439943
12: 2020-03-12 14.146691 220.586634
13: 2020-03-13 20.649548 241.236182
14: 2020-03-14 17.606369 258.842551
15: 2020-03-15  3.984816 262.827367

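I also have a second table listing the crop growth stages and their thermal requirements (that is, the cumulative temperature threshold needed to reach each stage):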

 growth_stage thermal_req
1:           VE         120
2:           V2         200
3:           V3         350
4:        V5-V6         475
5:        V7-V9         610
6:           R2        1660
7:           R4        1925
8:           R5        2450
9:           R6        2700

From these tables I need two results:

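  1. Look up the temperature table and update the thermal requirements table with the date on which each requirement was reached. For this example it would look like this: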
  growth_stage thermal_req date_reached
1:           VE         120   2020-03-08
2:           V2         200   2020-03-11
3:           V3         350   2020-03-21
4:        V5-V6         475   2020-03-26
5:        V7-V9         610   2020-04-03
6:           R2        1660   2020-06-14
7:           R4        1925   2020-06-30
8:           R5        2450   2020-08-06
9:           R6        2700   2020-08-23
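  2. Update the original temperature table with a new "growth_stage" column, based on the thermal requirements in the second table. It would look like this: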
         date      temp   cum_temp  growth_stage
 1: 2020-03-01  9.339748   9.339748 NA
 2: 2020-03-02 23.860849  33.200597 NA
 3: 2020-03-03 12.860331  46.060928 NA
 4: 2020-03-04 26.607505  72.668432 NA
 5: 2020-03-05 28.273551 100.941984 NA
 6: 2020-03-06  2.321138 103.263122 NA
 7: 2020-03-07 16.315059 119.578181 NA
 8: 2020-03-08 26.880152 146.458334 VE
 9: 2020-03-09 16.991615 163.449949 VE
10: 2020-03-10 14.241827 177.691776 VE
11: 2020-03-11 28.748167 206.439943 V2
12: 2020-03-12 14.146691 220.586634 V2
13: 2020-03-13 20.649548 241.236182 V2
14: 2020-03-14 17.606369 258.842551 V2
15: 2020-03-15  3.984816 262.827367 V2
16: 2020-03-16 27.094924 289.922291 V2
17: 2020-03-17  8.136544 298.058835 V2
18: 2020-03-18  2.219726 300.278562 V2
19: 2020-03-19 10.509701 310.788263 V2
20: 2020-03-20 28.680606 339.468868 V2
21: 2020-03-21 26.796640 366.265509 V3
22: 2020-03-22 21.091299 387.356807 V3
23: 2020-03-23 19.574698 406.931505 V3
24: 2020-03-24 29.833824 436.765328 V3
25: 2020-03-25 20.015468 456.780797 V3
26: 2020-03-26 21.547384 478.328180 V5-V6
27: 2020-03-27 16.777915 495.106095 V5-V6
28: 2020-03-28 18.230119 513.336214 V5-V6
29: 2020-03-29  9.385632 522.721846 V5-V6
30: 2020-03-30  5.266296 527.988142 V5-V6
31: 2020-03-31 28.927703 556.915844 V5-V6
32: 2020-04-01 27.166672 584.082517 V5-V6
33: 2020-04-02 21.030453 605.112970 V5-V6
34: 2020-04-03 24.068555 629.181525 V7-V9
35: 2020-04-04  1.713797 630.895322 V7-V9
36: 2020-04-05 14.856083 645.751405 V7-V9
37: 2020-04-06 22.995327 668.746732 V7-V9
38: 2020-04-07  7.275830 676.022562 V7-V9
39: 2020-04-08 10.227249 686.249811 V7-V9
40: 2020-04-09  7.717148 693.966959 V7-V9
          date      temp   cum_temp growth_stage

What is the best way to achieve these results?

Data to reproduce the problem:

# load required packages
library(data.table)

# generate data
dates <- seq(as.Date("2020-03-01"), as.Date("2020-08-31"), by="days")
set.seed(123); temps <- runif(length(dates), min=1, max=30)
dat <- data.table(date=dates,
                  temp=temps)
# cumulative sum
dat$cum_temp <- cumsum(dat$temp)

# table with growth stage thermal requirements 
sum_req <- data.table(growth_stage=c("VE","V2","V3","V5-V6","V7-V9","R2","R4","R5","R6"),
                      thermal_req=c(120,200,350,475,610,1660,1925,2450,2700))
r date data.table threshold
1 Answer

There may be a better way, but if the dates are in order and the cumulative temperature never decreases, it can be as simple as:

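# start with a character column of NAs (a plain NA would create a logical column)
dat$growth_stage <- NA_character_

# sum_req must be sorted by increasing thermal_req so later stages overwrite earlier ones
for (i in seq_len(nrow(sum_req))) {
  indices <- which(dat$cum_temp >= sum_req$thermal_req[i])
  dat$growth_stage[indices] <- sum_req$growth_stage[i]
}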

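On each pass, this finds the rows whose cumulative temperature is at or above the current threshold and writes the corresponding growth stage into the column. The first threshold matches a lot of rows; as the loop moves on to later growth stages, those entries get overwritten.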
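If the overwriting loop feels wasteful, the same labelling can probably be done in one vectorized step. The sketch below is not part of the original answer; it assumes `cum_temp` never decreases and uses base R's `findInterval()` to count, for each row, how many thresholds have been reached. The data setup is copied from the question so the block runs on its own.

```r
library(data.table)

# Rebuild the question's example data
dates <- seq(as.Date("2020-03-01"), as.Date("2020-08-31"), by = "days")
set.seed(123); temps <- runif(length(dates), min = 1, max = 30)
dat <- data.table(date = dates, temp = temps)
dat[, cum_temp := cumsum(temp)]

sum_req <- data.table(
  growth_stage = c("VE", "V2", "V3", "V5-V6", "V7-V9", "R2", "R4", "R5", "R6"),
  thermal_req  = c(120, 200, 350, 475, 610, 1660, 1925, 2450, 2700)
)

# findInterval() requires the thresholds in increasing order
setorder(sum_req, thermal_req)

# For each row: number of thresholds that are <= cum_temp (0 = none reached yet)
stage_idx <- findInterval(dat$cum_temp, sum_req$thermal_req)

# Turn 0 into NA so rows before the first stage get NA instead of being dropped
stage_idx[stage_idx == 0L] <- NA_integer_

dat[, growth_stage := sum_req$growth_stage[stage_idx]]
```

With the question's data this should give `NA` up to 2020-03-07 and `VE` from 2020-03-08, matching the loop, since both use the same "at or above the threshold" rule.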

Again, this works here but may not generalize well.
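The loop only covers the second result (the growth_stage column). For the first result, the date on which each threshold is reached, one possible sketch is to take, for every `thermal_req`, the first row whose `cum_temp` is at or above it. This is an addition rather than part of the original answer, and it assumes the rows of `dat` are sorted by date. A threshold that is never reached comes back as `NA`.

```r
library(data.table)

# Rebuild the question's example data
dates <- seq(as.Date("2020-03-01"), as.Date("2020-08-31"), by = "days")
set.seed(123); temps <- runif(length(dates), min = 1, max = 30)
dat <- data.table(date = dates, temp = temps)
dat[, cum_temp := cumsum(temp)]

sum_req <- data.table(
  growth_stage = c("VE", "V2", "V3", "V5-V6", "V7-V9", "R2", "R4", "R5", "R6"),
  thermal_req  = c(120, 200, 350, 475, 610, 1660, 1925, 2450, 2700)
)

# Row index of the first day at or above each threshold (NA if never reached)
first_idx <- vapply(
  sum_req$thermal_req,
  function(th) which(dat$cum_temp >= th)[1L],
  integer(1)
)

# Look up the matching dates and add them by reference
sum_req[, date_reached := dat$date[first_idx]]
```

For the rows shown in the question, this should give 2020-03-08 for VE and 2020-03-11 for V2. A data.table rolling join could likely do the same without the `vapply()`, but the explicit version is easier to check by eye.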
