Q: How do I correctly unlist a data frame column of type `distribution` in R?
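In the example below, `generate()` gives me the data in the shape I want. But I have to use `forecast()` (`generate()` seems to have some problems with combination and/or hierarchical forecasts), which gives me a column of `distribution` objects. I'm not quite sure how to work with these. I've included my attempt, which uses `map()` and `pivot_longer()`. But this can't be the right way, can it?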
library(fpp3)
# data example from "Forecasting Principles and Practice", Ch 5.5
# https://otexts.com/fpp3/prediction-intervals.html
gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2015) |>
  mutate(day = row_number()) |>
  update_tsibble(index = day, regular = TRUE) ->
  google_stock
google_stock |> filter(year(Date) == 2015) -> google_2015
google_2015 %>% model(NAIVE(Close)) -> fit
# I like this sort of output, which is easy for me to plot
fit %>% generate(h = 30, times = 5, bootstrap = TRUE) -> gen.sim
# # A tsibble: 150 x 6 [1]
# # Key: Symbol, .model, .rep [5]
# Symbol .model .rep day .innov .sim
# <chr> <chr> <chr> <dbl> <dbl> <dbl>
# 1 GOOG NAIVE(Close) 1 253 -0.204 759.
# 2 GOOG NAIVE(Close) 1 254 -0.984 758.
# 3 GOOG NAIVE(Close) 1 255 -3.55 754.
# 4 GOOG NAIVE(Close) 1 256 2.21 756.
# This output is trickier for me, producing a `distribution` column
fit %>% forecast(h = 30, times = 5, bootstrap = TRUE) -> fcs.sim
# # A fable: 30 x 5 [1]
# # Key: Symbol, .model [1]
# Symbol .model day Close .mean
# <chr> <chr> <dbl> <dist> <dbl>
# 1 GOOG NAIVE(Close) 253 sample[5] 752.
# 2 GOOG NAIVE(Close) 254 sample[5] 752.
# 3 GOOG NAIVE(Close) 255 sample[5] 751.
# 4 GOOG NAIVE(Close) 256 sample[5] 769.
# This is my attempt to unlist the `distribution` column `Close`
# It works, but this can't be the right way to do it.
fcs.sim %>%
  pull(Close) %>%
  map(unlist) %>%
  bind_rows() %>%
  bind_cols(fcs.sim) %>%
  pivot_longer(cols = x1:x5) %>%
  mutate(name = str_remove(name, '^x')) %>%
  rename(.rep = name) ->
  fcs.sim.2.gen.sim
# # A tsibble: 150 x 5 [1]
# # Key: Symbol, .model, .rep [5]
# Symbol .model .rep day .sim
# <chr> <chr> <chr> <dbl> <dbl>
# 1 GOOG NAIVE(Close) 1 253 752.
# 2 GOOG NAIVE(Close) 1 254 758.
# 3 GOOG NAIVE(Close) 1 255 761.
# 4 GOOG NAIVE(Close) 1 256 750.
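A: The distribution is not a data frame column but a vctr. `distributional::parameters()` extracts the parameters of the distribution (in this case, the generated samples).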
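As a rough sketch of how that call might be used, the toy example below builds a small `dist_sample()` vector as a stand-in for the `Close` column and stacks its draws into long form. It assumes the draws come back in a list column named `x` (which matches the `x1:x5` names in the question's attempt); the `id` column and the object names are hypothetical and only there for illustration.

```r
library(distributional)

# Toy stand-in for the `Close` column: two sample distributions, three draws each.
d <- dist_sample(list(c(1, 2, 3), c(4, 5, 6)))

# parameters() returns one row per distribution.
# Assumption: for sample distributions the draws sit in a list column named `x`.
p <- parameters(d)
draws <- p$x

# Stack into long form: one row per (distribution, draw).
long <- data.frame(
  id   = rep(seq_along(draws), lengths(draws)),   # which distribution (hypothetical name)
  .rep = as.character(sequence(lengths(draws))),  # draw number within the distribution
  .sim = unlist(draws)                            # the sampled values
)
```

With the real fable, the same idea would presumably apply to `parameters(fcs.sim$Close)`, with `day` taking the place of `id`.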
More generally (not just for distributions), you can use `vec_data()` to get at the data inside a vctr.
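A small sketch of the `vec_data()` route, again on a toy `dist_sample()` vector rather than the real forecast. It assumes `vec_data()` here is `vctrs::vec_data()` and that each element of the result is a list-like object holding that distribution's draws; `unlist()` is used so the sketch does not rely on the exact internal field layout.

```r
library(distributional)

# Toy distribution vector: two sample distributions, three draws each.
d <- dist_sample(list(c(1, 2, 3), c(4, 5, 6)))

# Strip the vctr wrapper; what remains is a plain list, one element per distribution.
raw <- vctrs::vec_data(d)

# Flatten the first element to recover its draws as a bare numeric vector.
first_draws <- unlist(raw[[1]], use.names = FALSE)
```

This is close to what `map(unlist)` was doing in the question; the difference is only that the vctr is unwrapped explicitly first.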