如何在 R 中使用 hmmTMB 对观测状态进行一步预测？

Question

我一直在使用 R 中的 hmmTMB 包来使用隐马尔可夫模型 (HMM) 来建模和预测油价走势。在用训练数据拟合模型后，我正在努力使用测试数据集上的拟合模型对观察到的状态（例如石油价格）进行一步预测。

# Load necessary libraries
library(hmmTMB)
library(tidyverse)

# Set seed for reproducibility
set.seed(123)

# Example dataset preparation (extended for 1 year)
df <- data.frame(
    day = seq.Date(from = as.Date("2020-01-01"), to = as.Date("2020-12-31"), by = "day"),
    OILPRICE = rnorm(366, 100, 10)
    ) %>%
    as_tibble()

# Splitting into training and test sets
split_index <- round(nrow(df) * 0.8)
training_set <- df[1:split_index, ] %>% as.data.frame()
testing_set <- df[(split_index + 1):nrow(df), ] %>% as.data.frame()

# Define a hidden Markov model with 2 states for the training set
hid1 <- MarkovChain$new(data = training_set, n_states = 2)
dists <- list(OILPRICE = "norm")
par0 <- list(OILPRICE = list(mean = c(110, 90), sd = c(1, 1)))
obs1 <- Observation$new(data = training_set, n_states = 2, dists = dists, par = par0)
par0 <- obs1$suggest_initial()
obs1 <- Observation$new(data = training_set, n_states = 2, dists = dists, par = par0)
hmm1 <- HMM$new(obs = obs1, hid = hid1)
hmm1$fit(silent = TRUE)

# Rename
data_plot <- training_set

# Color by most likely state sequence
data_plot$state <- factor(paste0("State ", hmm1$viterbi()))
ggplot(data_plot, aes(day, OILPRICE, col = state)) +
    geom_point() +
    scale_color_manual(values = pal, name = NULL)

# Attempting to make a one-step-ahead prediction
# ??? This is where I need guidance

如何使用 hmmTMB 在我的测试数据集 (testing_set) 上使用拟合模型 (hmm1) 对观察到的状态（例如第二天的 OILPRICE）进行一步预测？我正在寻找一种基于适合训练集的模型来预测未来油价值的方法。

我很欣赏有关如何使用 hmmTMB 包实现此目标的任何见解或示例。

Answer 1

hmmTMB 目前没有内置预测功能（截至 2024 年 2 月）。然而，可以根据拟合的 hmmTMB 模型计算预测分布。

预测分布是状态相关分布的混合，其中权重对应于处于不同状态的概率。因此，您可以使用以下工作流程：

获取最后观察到的数据行（即，在时间 n）的状态分布；我们称其为
```
u
```
得到n+h时刻的状态分布为
```
u %*% (tpm %^% h)
```
，其中
```
tpm
```
是状态过程的（一步）转移概率矩阵；也就是说，我们将
```
u
```
乘以 h 步转移概率矩阵
将预测分布作为估计的状态相关分布的混合，并按上一步中找到的状态概率进行加权

在步骤 3 中，您可以计算预测平均值（作为状态相关平均值的加权和），或预测分布的概率密度函数，如下面的代码所示。

预测时间平均值

黑点显示最后一次观察训练集后 60 天的预测分布平均值。

预测分布

线条显示了最后一次观察训练集后 60 天的预测分布的概率密度函数。正如你所看到的，它很可能一开始处于状态2，因此相应的状态相关分布具有更高的权重。一段时间后，随着状态分布接近平稳（长期）分布，预测分布停止变化。

代码

# Load packages
library(scico)
library(expm)

# Grid of oil prices to plot forecast distributions
grid <- seq(min(training_set$OILPRICE), 
            max(training_set$OILPRICE), 
            length = 100)

# Get state distribution at last (training) observation
sp <- hmm1$state_probs()
u <- sp[nrow(sp),]

# Get model parameters
tpm <- hid1$tpm()[,,1]
obspar <- obs1$par()[,,1]

# Loop over forecast times (over 60 days here)
h <- 1:60
mix <- list()
mix_mean <- rep(NA, length = length(h))
for(i in seq_along(h)) {
    # State distribution at time n + h
    dist_h <- u %*% (tpm %^% h[i])
    
    # Forecast distribution at time n + h
    mix[[i]] <- 
        dist_h[1] * dnorm(x = grid, 
                          mean = obspar["OILPRICE.mean", "state 1"], 
                          sd = obspar["OILPRICE.sd", "state 1"]) +
        dist_h[2] * dnorm(x = grid, 
                          mean = obspar["OILPRICE.mean", "state 2"], 
                          sd = obspar["OILPRICE.sd", "state 2"])
    
    # Mean of forecast distribution
    mix_mean[i] <- sum(dist_h * obspar["OILPRICE.mean",])
}

# Plot the forecast mean against time
df_mean <- data.frame(day = seq.Date(from = as.Date("2020-10-20"), 
                                     by = "day", length = length(h)),
                      mean = mix_mean)

ggplot(data_plot, aes(day, OILPRICE, col = state)) +
    geom_point() +
    geom_point(aes(y = mean, col = NULL), data = df_mean) +
    scale_color_manual(values = pal)

# Plot the forecast distributions for h = 1, 2, ..., 60
df_mix <- data.frame(h = rep(h, each = 100),
                 price = grid,
                 mix = unlist(mix))

ggplot(df_mix, aes(price, mix, col = h, group = h)) +
    geom_line() +
    scale_color_scico(name = "time") +
    labs(y = "forecast distribution")

如何在 R 中使用 hmmTMB 对观测状态进行一步预测？

问题描述投票：0回答：1

1个回答

预测时间平均值

预测分布

代码

最新问题

如何在 R 中使用 hmmTMB 对观测状态进行一步预测？

问题描述 投票：0回答：1

1个回答

预测时间平均值

预测分布

代码

最新问题

问题描述投票：0回答：1