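While reading the online mlr3 book Applied Machine Learning Using mlr3 in R (https://mlr3book.mlr-org.com/chapters/chapter4/hyperparameter_optimization.html), I am having some difficulty figuring out how to make sure that the hyperparameters are optimized only on the training data and that subsequent predictions are made only on the test data. Here is the code and the initial error. Note that after introducing this code, the chapter moves on to using auto_tuner to do this, but for my purposes I need to do it manually here.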
library(mlr3tuning)
library(mlr3tuningspaces)
library(mlr3learners)
library(mlr3extralearners)
library(e1071)
library(paradox)
#Specifying Task
tsk_sonar = tsk("sonar")
tsk_sonar$set_col_roles("Class", c("target", "stratum"))
#Partitioning Data set into Train and Test Samples
splits = mlr3::partition(tsk_sonar, ratio = 0.80)
#Defining Learner and range of hyperparameters for optimization
learner = lrn("classif.svm",
cost = to_tune(1e-5, 1e5, logscale = TRUE),
gamma = to_tune(1e-5, 1e5, logscale = TRUE),
kernel = "radial",
type = "C-classification"
)
#Specifying the rows constituting the training data set for the learner
learner$train(tsk_sonar, row_ids = splits$train)
# Console output from the call above:
# > learner$train(tsk_sonar, row_ids = splits$train)
# Error in svm.default(x = data, y = task$truth(), probability = (self$predict_type == :
#   'list' object cannot be coerced to type 'double'
#Specifying Tuning Instance
instance = ti(
task = tsk_sonar,
learner = learner,
resampling = rsmp("cv", folds = 3),
measures = msr("classif.ce"),
terminator = trm("none")
)
# Defining Hyperparameter Search
tuner = tnr("grid_search", resolution = 5, batch_size = 10)
#Running hyperparameter tuning for optimization
tuner$optimize(instance)
#Training the data on the full data set
lrn_svm_tuned = lrn("classif.svm")
lrn_svm_tuned$param_set$values = instance$result_learner_param_vals
#Final trained model for use in prediction
lrn_svm_tuned$train(tsk_sonar)$model
#Create predictions on the test data
prediction = lrn_svm_tuned$predict(tsk_sonar, splits$test)
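You found a bug. It is not possible to train a learner while a TuneToken is present in its parameter set. This has nothing to do with the train-test split. If you are really worried about that, you can check the resampling splits in instance$archive$benchmark_result$resamplings after the optimization.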
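For what it's worth, here is a minimal sketch of one way to do the manual workflow so that tuning only sees the training rows. It is not taken from the book: the filtered copy of the task (tsk_train), the fixed seed and the final check against splits$train are my own assumptions for illustration. The learner that carries the TuneTokens is only passed to ti() and never trained directly; a fresh learner receives the tuned values and is trained on the training rows.

```r
library(mlr3)
library(mlr3tuning)
library(mlr3learners)  # provides classif.svm (requires the e1071 package)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible

tsk_sonar = tsk("sonar")
tsk_sonar$set_col_roles("Class", c("target", "stratum"))
splits = partition(tsk_sonar, ratio = 0.80)

# Assumption: tune on a copy of the task that holds only the training rows.
# filter() keeps the original row ids, so they can be compared with splits.
tsk_train = tsk_sonar$clone()$filter(splits$train)

# Learner with TuneTokens: hand it to ti(), do not call $train() on it
learner = lrn("classif.svm",
  cost = to_tune(1e-5, 1e5, logscale = TRUE),
  gamma = to_tune(1e-5, 1e5, logscale = TRUE),
  kernel = "radial",
  type = "C-classification"
)

instance = ti(
  task = tsk_train,
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  terminator = trm("none")
)
tuner = tnr("grid_search", resolution = 5, batch_size = 10)
tuner$optimize(instance)

# Inspect the resampling used during tuning and collect every row it touched
rs = instance$archive$benchmark_result$resamplings$resampling[[1]]
tuning_rows = unique(unlist(lapply(seq_len(rs$iters), function(i) {
  c(rs$train_set(i), rs$test_set(i))
})))
all(tuning_rows %in% splits$train)  # no test rows were used for tuning

# Fresh learner without TuneTokens, configured with the tuned values
lrn_svm_tuned = lrn("classif.svm")
lrn_svm_tuned$param_set$values = instance$result_learner_param_vals

# Train on the training rows only, then predict on the held-out test rows
lrn_svm_tuned$train(tsk_sonar, row_ids = splits$train)
prediction = lrn_svm_tuned$predict(tsk_sonar, row_ids = splits$test)
prediction$score(msr("classif.ce"))
```

The code in the question passes the full tsk_sonar to ti() and trains the final model on the full task, so the test rows take part in both steps. Restricting the tuning task and passing row_ids to $train() is one way to keep them out; whether you later refit on all the data is a separate decision.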