Cross-validation for comparing linear models

Question

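I have two linear models to compare: a reduced model and a full model. I have already run an F-test on the two models, but I don't know how to make the same comparison using 5-fold cross-validation. Below is the R code I have so far.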

library(carData) #make sure this package is installed!
data("Anscombe") #NOTE: lowercase was the data from before, uppercase is this dataset (annoying)
spending = subset(Anscombe,! rownames(Anscombe) %in% c("HI","AK","DC"),c(1,2,4)) ##Data file. 

mod_full = lm(education ~ urban + income, data=spending) ##Full model. 

mod_reduced = lm(education ~ income, data = spending) ## Reduced model. 

FTest = anova(mod_reduced, mod_full)
print(FTest) ##Print out the F-test result. 

r linear-regression cross-validation
1 Answer

You can compare the two models with 5-fold cross-validation using the following code:

library(caret)

library(carData) 
data("Anscombe") 
spending = subset(Anscombe,! rownames(Anscombe) %in% c("HI","AK","DC"),c(1,2,4)) ##Data file. 
#Settings for 5-fold cross-validation
#(named cv_control so the object does not mask caret's trainControl() function)
cv_control <- trainControl(method="cv", number=5,
                           savePredictions=TRUE, classProbs=FALSE)

#Full model
set.seed(7) #Same seed before each train() call so both models get the same folds
fit.full <- train(education ~ urban + income, data=spending, method="lm", metric="RMSE",
                  preProc=c("center", "scale"),
                  trControl=cv_control)
#Reduced model
set.seed(7)
fit.reduced <- train(education ~ income, data=spending, method="lm", metric="RMSE",
                     preProc=c("center", "scale"),
                     trControl=cv_control)

#Compare the 2 models on their resampled (held-out) performance
results <- resamples(list(Full=fit.full,Reduced=fit.reduced))
summary(results)
dotplot(results, scales=list(x=list(relation="free")))
#Correlation between the models' resampled results
modelCor(results)
splom(results)

#Differences in resampled performance between the models
diffs <- diff(results)
#Summarize p-values for pair-wise comparisons
summary(diffs)
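
If you would rather see what the cross-validation is doing without caret, here is a minimal base-R sketch of the same idea. It is not caret's implementation: `cv_rmse` is a hypothetical helper written for this example, the fold assignment is a plain random split, and it skips the centering and scaling step used above. Both models are scored on identical folds so the per-fold RMSEs can be compared as pairs.

```r
library(carData)  # provides the Anscombe data set used in the question

data("Anscombe")
spending <- subset(Anscombe,
                   !rownames(Anscombe) %in% c("HI", "AK", "DC"),
                   c(1, 2, 4))

# Hypothetical helper (not part of caret): returns one held-out RMSE per fold
# for an lm() fitted with the given formula.
cv_rmse <- function(formula, data, folds) {
  response <- all.vars(formula)[1]  # name of the left-hand-side variable
  sapply(sort(unique(folds)), function(i) {
    train <- data[folds != i, , drop = FALSE]
    test  <- data[folds == i, , drop = FALSE]
    fit   <- lm(formula, data = train)
    pred  <- predict(fit, newdata = test)
    sqrt(mean((test[[response]] - pred)^2))
  })
}

set.seed(7)
k <- 5
# Randomly assign each row to one of k folds; both models share this assignment.
folds <- sample(rep(seq_len(k), length.out = nrow(spending)))

rmse_full    <- cv_rmse(education ~ urban + income, spending, folds)
rmse_reduced <- cv_rmse(education ~ income, spending, folds)

# Mean held-out RMSE per model; lower is better.
print(c(Full = mean(rmse_full), Reduced = mean(rmse_reduced)))

# Paired t-test on the per-fold RMSEs (only 5 pairs, so it has little power).
print(t.test(rmse_full, rmse_reduced, paired = TRUE))
```

The numbers will not match caret's output exactly because the folds are drawn differently, but the interpretation is the same: prefer the model with the lower held-out RMSE, and treat the paired test as a rough guide given how few folds there are.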