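I have two linear models to compare: a reduced model and a full model. I have already run an F-test on the two models, but I don't know how to make the same comparison using 5-fold cross-validation. This is the R code I have so far: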
library(carData) #make sure this package is installed!
data("Anscombe") #NOTE: lowercase was the data from before, uppercase is this dataset (annoying)
spending = subset(Anscombe,! rownames(Anscombe) %in% c("HI","AK","DC"),c(1,2,4)) ##Data file.
mod_full = lm(education ~ urban + income, data=spending) ##Full model.
mod_reduced = lm(education ~ income, data = spending) ## Reduced model.
FTest = anova(mod_reduced, mod_full)
print(FTest) ##Print out the F-test result.
You can compare the two models with 5-fold cross-validation using the following code:
library(caret)
library(carData)
data("Anscombe")
spending = subset(Anscombe,! rownames(Anscombe) %in% c("HI","AK","DC"),c(1,2,4)) ##Data file.
#Setting for 5-fold cross-validation
#Named cv_control so the object does not shadow caret's trainControl() function
cv_control <- trainControl(method="cv", number=5,
                           savePredictions=TRUE, classProbs=FALSE)
#Full model
set.seed(7) #To have reproducible results
fit.full <- train(education ~ urban + income, data=spending, method="lm", metric="RMSE",
                  preProc=c("center", "scale"),
                  trControl=cv_control)
#Reduced model
set.seed(7) #Same seed, so both models are evaluated on the same folds
fit.reduced <- train(education ~ income, data=spending, method="lm", metric="RMSE",
                     preProc=c("center", "scale"),
                     trControl=cv_control)
#For comparing 2 models
results <- resamples(list(Full=fit.full,Reduced=fit.reduced))
summary(results)
dotplot(results, scales=list(x=list(relation="free"))) #Free x-axis per metric panel
# correlation between results
modelCor(results)
splom(results)
#Differences in resampled performance metrics between the two models
diffs <- diff(results)
#Summarize p-values for pair-wise comparisons
summary(diffs)
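
If you want to see roughly what the cross-validation is doing without caret, here is a minimal base-R sketch. It is only an illustration under some assumptions: `cv_rmse` and `fold_id` are hypothetical names I made up (they are not caret functions), and the folds are drawn with a plain `sample()`, so the numbers will not match caret's output exactly.

```r
library(carData)  # provides the Anscombe data set used above

data("Anscombe")
spending <- subset(Anscombe, !rownames(Anscombe) %in% c("HI", "AK", "DC"), c(1, 2, 4))

# Hypothetical helper (not part of caret): fit `formula` on all folds but one,
# predict the held-out fold, and return the RMSE for each fold.
cv_rmse <- function(formula, data, fold_id) {
  response <- all.vars(formula)[1]  # name of the left-hand-side variable
  sapply(sort(unique(fold_id)), function(k) {
    train_rows <- data[fold_id != k, ]
    test_rows  <- data[fold_id == k, ]
    fit  <- lm(formula, data = train_rows)
    pred <- predict(fit, newdata = test_rows)
    sqrt(mean((test_rows[[response]] - pred)^2))
  })
}

set.seed(7)  # reproducible fold assignment
k <- 5
# Assign every row to one of k folds of (nearly) equal size, in random order
fold_id <- sample(rep(seq_len(k), length.out = nrow(spending)))

# Both models are scored on exactly the same folds
rmse_full    <- cv_rmse(education ~ urban + income, spending, fold_id)
rmse_reduced <- cv_rmse(education ~ income, spending, fold_id)

cv_table <- data.frame(fold = seq_len(k), full = rmse_full, reduced = rmse_reduced)
print(cv_table)
print(colMeans(cv_table[, c("full", "reduced")]))  # average held-out RMSE per model
```

The model with the lower average held-out RMSE predicts better on unseen data. With only 5 folds the per-fold values can vary quite a bit, so treat a small difference between the two averages with caution.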