Retrieving and modifying XGBoost weights


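I am using the xgboost library to train a binary classifier. I want to keep the trained model from leaking information about the data by adding noise to the weights (that is, the values at the leaf nodes of the trees in the ensemble). To do this, I need to retrieve the weights of each tree and modify them.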

I can inspect the weights by calling dump_model or trees_to_dataframe on the Booster object, which I define as

model = xgb.Booster(params, [dtrain])

The latter method returns a pandas DataFrame:

   Tree  Node    ID                          Feature  Split   Yes    No Missing        Gain     Cover
0      0     0   0-0                           tenure   17.0   0-1   0-2     0-1  671.161072  1595.500
1      0     1   0-1      InternetService_Fiber optic    1.0   0-3   0-4     0-3  343.489227   621.125
2      0     2   0-2      InternetService_Fiber optic    1.0   0-5   0-6     0-5  293.603149   974.375
3      0     3   0-3                           tenure    4.0   0-7   0-8     0-7   95.604340   333.750
4      0     4   0-4                     TotalCharges  120.0   0-9  0-10     0-9   27.897919   287.375
5      0     5   0-5                Contract_Two year    1.0  0-11  0-12    0-11   32.057739   512.625
6      0     6   0-6                           tenure   60.0  0-13  0-14    0-13  120.693176   461.750
7      0     7   0-7  TechSupport_No internet service    1.0  0-15  0-16    0-15   37.326447   149.750
8      0     8   0-8  TechSupport_No internet service    1.0  0-17  0-18    0-17   34.968536   184.000
9      0     9   0-9                  TechSupport_Yes    1.0  0-19  0-20    0-19    0.766754    65.500
10     0    10  0-10                MultipleLines_Yes    1.0  0-21  0-22    0-21   19.335510   221.875
11     0    11  0-11                 PhoneService_Yes    1.0  0-23  0-24    0-23   19.035950   281.125
12     0    12  0-12                             Leaf    NaN   NaN   NaN     NaN   -0.191398   231.500
13     0    13  0-13   PaymentMethod_Electronic check    1.0  0-25  0-26    0-25   43.379410   320.875
14     0    14  0-14                Contract_Two year    1.0  0-27  0-28    0-27   13.401367   140.875
15     0    15  0-15                             Leaf    NaN   NaN   NaN     NaN    0.050262    94.500
16     0    16  0-16                             Leaf    NaN   NaN   NaN     NaN   -0.052444    55.250
17     0    17  0-17                             Leaf    NaN   NaN   NaN     NaN   -0.058929   111.000
18     0    18  0-18                             Leaf    NaN   NaN   NaN     NaN   -0.148649    73.000
19     0    19  0-19                             Leaf    NaN   NaN   NaN     NaN    0.161464    63.875

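Here the leaf values are stored in the Gain column (leaf nodes are the rows whose Feature column has the value Leaf). So I could add noise to the corresponding rows of the Gain column, but then I do not know how to convert the pandas DataFrame back into a Booster object / XGBoost model. How can I achieve this? Or is there a better way to retrieve and modify the leaf values of an XGBoost model?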
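
One possible route, sketched here rather than taken from an answer, is to skip the DataFrame and edit the model's JSON serialization instead. The sketch assumes a gbtree booster and the JSON layout written by recent xgboost releases, where each tree stores parallel arrays and a leaf is a node whose left_children entry is -1, with its leaf value held in split_conditions. That layout should be checked against the installed version. The names add_noise_to_leaves, perturb_booster, scale, seed and the file name model.json are hypothetical, and the toy dict is a hand-written miniature, not real model output.

```python
import json
import random


def add_noise_to_leaves(model_json, scale=0.01, seed=None):
    """Add Gaussian noise in place to every leaf value of a model dict.

    Assumes the gbtree JSON layout: learner -> gradient_booster -> model -> trees,
    where a leaf has left_children == -1 and its value sits in split_conditions.
    Returns the number of leaves that were perturbed.
    """
    rng = random.Random(seed)
    trees = model_json["learner"]["gradient_booster"]["model"]["trees"]
    n_changed = 0
    for tree in trees:
        values = tree["split_conditions"]
        for node_id, left_child in enumerate(tree["left_children"]):
            if left_child == -1:  # leaf node: perturb its stored value
                values[node_id] += rng.gauss(0.0, scale)
                n_changed += 1
    return n_changed


def perturb_booster(booster, path="model.json", scale=0.01, seed=None):
    """Round-trip a trained Booster through JSON, adding noise to its leaves."""
    import xgboost as xgb  # imported lazily so the helper above runs without it

    booster.save_model(path)  # a .json extension selects the JSON format
    with open(path) as f:
        model_json = json.load(f)
    add_noise_to_leaves(model_json, scale=scale, seed=seed)
    with open(path, "w") as f:
        json.dump(model_json, f)
    noisy = xgb.Booster()
    noisy.load_model(path)
    return noisy


# Hypothetical one-tree miniature of the assumed layout: node 0 splits at 17.0,
# nodes 1 and 2 are leaves.
toy = {
    "learner": {
        "gradient_booster": {
            "model": {
                "trees": [
                    {
                        "left_children": [1, -1, -1],
                        "right_children": [2, -1, -1],
                        "split_conditions": [17.0, -0.191398, 0.050262],
                    }
                ]
            }
        }
    }
}
changed = add_noise_to_leaves(toy, scale=0.01, seed=0)
```

With a trained model, perturb_booster(model) would return a new Booster whose split thresholds are untouched and whose leaf values carry the added noise. Comparing its predictions with the original model's on a few rows is a quick sanity check that the edit landed on the leaves.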

python xgboost