Hyperparameter tuning of XGBoost for a multiclass target variable


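I have a multiclass classification problem (the target must be predicted as 1, 2 or 3) that I am trying to solve with XGBoost. I am trying to fine-tune my parameters using random search. Here is my code: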

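I have already tried changing the 'scoring' parameter in 'param_distributions' from 'auc_roc' to 'precision', 'f1_samples' and 'jaccard' (this raised a different error, related to the 'average' parameter, because the problem is multiclass).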

loss=['hinge','log','modified_huber','squared_hinge','perceptron']
penalty = ['l1','l2','elasticnet']
alpha = [0.0001, 0.001,0.01,0.1,1,10,100,1000]
learnin_rate = ['constant','optimal','invscaling','adaptive']
class_weight = [{0.3,0.5,0.2},{0.3,0.4,0.3}]
eta0 = [1,10,100]

xg_class = xgb.XGBClassifier(objective = "multi:softmax", colsample_bytree = 1,
gamma = 1,subsample = 0.8, learning_rate = 0.01, max_depth = 3,
alpha = 10,n_estimators = 1000, multilabel_ =True, num_classes = 3)

from sklearn.metrics import jaccard_score

param_distributions = dict(loss = loss, penalty=penalty, alpha=alpha, learnin_rate=learnin_rate, class_weight=class_weight, eta0=eta0)
random = RandomizedSearchCV(estimator = xg_class, param_distributions=param_distributions, 
scoring = jaccard_score(y_true=Y_miss_xgb_test, y_pred = preds_miss_xgb, average = 'micro'),
verbose = 1, n_jobs =-1, n_iter = 1000)

random_result = random.fit(X_miss_xgb_train, Y_miss_xgb_train)

The error I get is:

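ValueError: scoring should either be a single string or callable for single metric evaluation or a list/tuple of strings or a dict of scorer name mapped to the callable for multiple metric evaluation. Got 0.3996569468267582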

scikit-learn precision xgboost grid-search
1 Answer

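RandomizedSearchCV expects the 'scoring' argument to be a single string or a callable for single-metric evaluation, or a list/tuple of strings or a dict mapping scorer names to callables for multi-metric evaluation. You passed a float instead: jaccard_score(y_true=Y_miss_xgb_test, y_pred = preds_miss_xgb, average = 'micro') is evaluated on the spot and returns a float score (0.3996569468267582, to be exact), and that number is what RandomizedSearchCV receives.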
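To make the distinction concrete, here is a minimal sketch of the two things that can end up in 'scoring'. The toy labels are invented for illustration, and make_scorer is scikit-learn's helper for wrapping a metric function into a scorer. It is offered as one possible way to get a callable, not as the only fix.

```python
from sklearn.metrics import jaccard_score, make_scorer

# Toy labels, invented purely for illustration.
y_true = [1, 2, 3, 1, 2, 3]
y_pred = [1, 2, 1, 1, 3, 3]

# Calling the metric evaluates it immediately and yields a plain number.
# Passing this number as `scoring` is what triggers the ValueError.
value = jaccard_score(y_true, y_pred, average="micro")
print(type(value), value)

# make_scorer wraps the metric without calling it; the resulting object
# is a callable that RandomizedSearchCV can invoke on each CV fold.
micro_jaccard = make_scorer(jaccard_score, average="micro")
print(callable(micro_jaccard))
```

The scorer object (micro_jaccard here, a name chosen for this sketch) can then be passed as scoring = micro_jaccard.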

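You can specify the Jaccard score as a string instead. The scorer name to use is "jaccard_micro" (micro-averaged, matching your average = 'micro'), not "jaccard_score"; the plain "jaccard" scorer assumes a binary target and raises the 'average' error on multiclass data:

random = RandomizedSearchCV(estimator = xg_class, 
                            param_distributions=param_distributions, 
                            scoring = "jaccard_micro",
                            verbose = 1, 
                            n_jobs =-1, 
                            n_iter = 1000)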