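I am using mgcv to generate predictions from a GAM fitted with a cubic spline that is forced through a given point. I built the GAM as outlined in this post: https://stat.ethz.ch/pipermail/r-help/2013-March/350253.html
However, when I try to predict y for new values of x within the range of the GAM, I get this error: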
Error: variable 'X' was fitted with type "nmatrix.8" but type "numeric" was supplied.
Here is the script I used to try to make the predictions:
## Fake some data...
library(mgcv)
set.seed(0)
n <- 100
x <- runif(n)*4-1;x <- sort(x);
f <- exp(4*x)/(1+exp(4*x));y <- f+rnorm(100)*0.1;plot(x,y)
dat <- data.frame(x=x,y=y)
## Create a spline basis and penalty, making sure there is a knot
## at the constraint point, (0 here, but could be anywhere)
knots <- data.frame(x=seq(-1,3,length=9)) ## create knots
## set up smoother...
sm <- smoothCon(s(x,k=9,bs="cr"),dat,knots=knots)[[1]]
## 3rd parameter is value of spline at knot location 0,
## set it to 0 by dropping...
X <- sm$X[,-3] ## spline basis
S <- sm$S[[1]][-3,-3] ## spline penalty
off <- y*0 + .6 ## offset term to force curve through (0, .6)
## fit spline constrained through (0, .6)...
b <- gam(y ~ X - 1 + offset(off),paraPen=list(X=list(S)))
lines(x,predict(b))
library(tidyverse)
# Predict values across concentrations within measured range
x_seq <- seq(-1, 3, length.out = 10000)
newdata <- data.frame(X = x_seq)
# Generate the spline basis matrix for the new data
sm_new <- smoothCon(s(X, k=9, bs="cr"), newdata, knots=knots)[[1]]
X_new <- sm_new$X[,-3] # Match the basis matrix structure used in the model
# Create a data frame with the spline basis matrix
X_new_df <- as.data.frame(X_new)
# Predict the offset term for the new data
off_new <- y*0 + .6 # Use the same offset as in the original model
# Combine the new data with the basis matrix and the offset
predict_data <- cbind(newdata, X_new_df, off_new) %>%
  rename(off = off_new)
# Predict values using the GAM
predictions <- predict(b, newdata = predict_data, se.fit = TRUE)
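I also tried putting the variables the function needs into a list so that they could be passed through the data argument, but that produces a separate problem in which the matrix (X) is not recognized: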
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 27, 8
Here is the adjusted script I used to build this model, which produces the error:
list.dat <- list(y = y,
                 X = X,
                 S = S,
                 off = off)
b <- gam(y ~ X - 1 + offset(off),
         data = list.dat, paraPen=list(bigX=list(S)))
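Can you help me generate predictions of y for more x values than are in the original data? I would prefer to supply my variables to the model as a list in the data argument, but I am open to other suggestions.
Thank you very much for your advice! :)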
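On the first error: the model b was fitted with X as an 8-column matrix (hence "nmatrix.8"), while the newdata passed to predict() contained a plain numeric column named X. One way to stay with the manual basis approach is to skip predict() for new data and build the prediction matrix yourself with PredictMat() from mgcv. The following is only a sketch under the question's setup; the names x_seq, Xp, fit and se are illustrative, and the constant 0.6 is the question's offset.

```r
library(mgcv)

## Same simulated data and constrained fit as in the question
set.seed(0)
n <- 100
x <- sort(runif(n) * 4 - 1)
y <- exp(4 * x) / (1 + exp(4 * x)) + rnorm(n) * 0.1
dat <- data.frame(x = x, y = y)

knots <- data.frame(x = seq(-1, 3, length = 9))   # third knot sits at x = 0
sm <- smoothCon(s(x, k = 9, bs = "cr"), dat, knots = knots)[[1]]
X <- sm$X[, -3]            # drop the coefficient for the knot at 0
S <- sm$S[[1]][-3, -3]     # matching penalty
off <- rep(0.6, n)         # offset forcing the curve through (0, 0.6)
b <- gam(y ~ X - 1 + offset(off), paraPen = list(X = list(S)))

## Evaluate the SAME smooth object at new x values, then drop the same column
x_seq <- seq(-1, 3, length.out = 10000)
Xp <- PredictMat(sm, data.frame(x = x_seq))[, -3]

## Linear predictor = basis %*% coefficients + the constant offset
fit <- drop(Xp %*% coef(b)) + 0.6
## Pointwise standard errors from the coefficient covariance matrix
se <- sqrt(rowSums((Xp %*% vcov(b)) * Xp))

pred <- data.frame(x = x_seq, fit = fit, se = se)
```

The key point is that the new basis is evaluated from the fitted smooth object sm rather than by calling smoothCon() again on the new data, so the columns line up with the coefficients in b.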
You can now do this directly with s(), via its pc argument (short for "point constraint"):
library(mgcv)
set.seed(0)
n <- 100
x <- runif(n) * 4 - 1
x <- sort(x)
f <- exp(4 * x) / (1 + exp(4 * x))
y <- f + rnorm(100) * 0.1
dat <- data.frame(x = x, y = y)
m <- gam(y ~ s(x, pc = 0), data = dat, method = "REML")
After that, predict() and related functions work as expected:
library("gratia")
ds <- data_slice(m, x = evenly(x))
fv <- fitted_values(m, data = ds) # calls predict
fv
# A tibble: 100 × 6
.row x .fitted .se .lower_ci .upper_ci
<int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 -0.946 0.0364 0.0526 -0.0666 0.140
2 2 -0.907 0.0375 0.0463 -0.0534 0.128
3 3 -0.867 0.0386 0.0406 -0.0410 0.118
4 4 -0.828 0.0402 0.0358 -0.0299 0.110
5 5 -0.788 0.0424 0.0319 -0.0201 0.105
6 6 -0.749 0.0455 0.0292 -0.0118 0.103
7 7 -0.709 0.0497 0.0277 -0.00467 0.104
8 8 -0.670 0.0554 0.0272 0.00202 0.109
9 9 -0.630 0.0627 0.0273 0.00923 0.116
10 10 -0.591 0.0720 0.0276 0.0179 0.126
# ℹ 90 more rows
# ℹ Use `print(n = ...)` to see more rows
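If you would rather not depend on gratia, the same predictions can be obtained with predict() directly, on as fine a grid as you like. This is a minimal sketch that refits the model above so it runs on its own; new_x, p and pred are illustrative names, and the interval uses a simple plus or minus 1.96 standard error approximation, which is my assumption rather than something the answer specifies.

```r
library(mgcv)

## Same data and model as above
set.seed(0)
n <- 100
x <- sort(runif(n) * 4 - 1)
y <- exp(4 * x) / (1 + exp(4 * x)) + rnorm(n) * 0.1
dat <- data.frame(x = x, y = y)
m <- gam(y ~ s(x, pc = 0), data = dat, method = "REML")

## 10000 evenly spaced x values within the observed range
new_x <- data.frame(x = seq(min(dat$x), max(dat$x), length.out = 10000))

## predict() needs only the original covariate, not a basis matrix
p <- predict(m, newdata = new_x, se.fit = TRUE)

pred <- data.frame(
  x     = new_x$x,
  fit   = as.numeric(p$fit),
  se    = as.numeric(p$se.fit),
  lower = as.numeric(p$fit - 1.96 * p$se.fit),  # approximate 95% interval
  upper = as.numeric(p$fit + 1.96 * p$se.fit)
)
head(pred)
```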
We can confirm the point constraint by plotting the smooth:
draw(m) # or plot(m)
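Note that pc = 0 constrains the smooth itself to be zero at x = 0, so in the model above the fitted curve at x = 0 equals the estimated intercept rather than a value you chose. If the goal is to pin the fitted curve at (0, 0.6) as in the question, one option is to drop the intercept and add a constant offset. This is a sketch under that assumption and not part of the original answer; m06 and the off column are illustrative names.

```r
library(mgcv)

## Same simulated data as above
set.seed(0)
n <- 100
x <- sort(runif(n) * 4 - 1)
y <- exp(4 * x) / (1 + exp(4 * x)) + rnorm(n) * 0.1
dat <- data.frame(x = x, y = y)

## Constant offset: the value the curve should take at the constraint point
dat$off <- 0.6

## No intercept (-1), smooth forced to 0 at x = 0, plus the offset:
## fitted value at x = 0 is therefore 0 + 0.6
m06 <- gam(y ~ s(x, pc = 0) - 1 + offset(off), data = dat, method = "REML")

## newdata must carry the offset column as well as x
at_zero <- predict(m06, newdata = data.frame(x = 0, off = 0.6))
at_zero
```

Because the offset lives in the formula, any newdata passed to predict() has to include an off column, here the same constant 0.6.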