How to pass hyperparameters as a single variable to RandomForestRegressor in a for loop

Question  Votes: 0  Answers: 2

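I am trying to use a for loop to send different sets of hyperparameters to RandomForestRegressor.

I use the code below to create the hyperparameter collection (a list? an array?) that drives the loop. When I try to fit the model, I keep getting the error listed after the code.

Is what I am trying to do possible? If so, how would I do it?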

    hyperparams = [{
                    'n_estimators':460,
                    'bootstrap':False,
                    'criterion':'poisson',
                    'max_depth':60,
                    'max_features':2,
                    'min_samples_leaf':1,
                    'min_samples_split':2
                },
                {
                    'n_estimators':60,
                    'bootstrap':False,
                    'criterion':'friedman_mse',
                    'max_depth':90,
                    'max_features':3,
                    'min_samples_leaf':1,
                    'min_samples_split':2
                }]
    for hparams in hyperparams:
        model_regressor = RandomForestRegressor(hparams)
        print(model_regressor.get_params())

        total_r2_score_value = 0
        total_mean_squared_error_array = 0

        total_explained_variance_score_value = 0
        total_max_error_value = 0
        total_mean_absolute_error_value = 0
        total_mean_absolute_percent_value = 0
        total_median_absolute_error_value = 0
        total_mean_tweedie_deviance_value = 0
        total_mean_pinball_loss_value = 0
        total_d2_pinball_score_value = 0
        total_d2_absolute_error_score_value = 0
        
        total_tests = 10
        for index in range(1, total_tests+1):
            
            # model fitting
            model_regressor.fit(X_train, y_train)



Error:

```
Traceback (most recent call last):
  File "c:\Projects\Python\DATA260\data_260_python\src\DATA_280A_Course\src\week6_project_work\jess_obesity_dataset_optimized_RFR.py", line 283, in <module>
    main()
  File "c:\Projects\Python\DATA260\data_260_python\src\DATA_280A_Course\src\week6_project_work\jess_obesity_dataset_optimized_RFR.py", line 210, in main
    model_regressor.fit(X_train, y_train)
  File "C:\Users\Jess\AppData\Local\Programs\Python\Python311\Lib\site-packages\sklearn\base.py", line 1144, in wrapper
    estimator._validate_params()
  File "C:\Users\Jess\AppData\Local\Programs\Python\Python311\Lib\site-packages\sklearn\base.py", line 637, in _validate_params
    validate_parameter_constraints(
  File "C:\Users\Jess\AppData\Local\Programs\Python\Python311\Lib\site-packages\sklearn\utils\_param_validation.py", line 95, in validate_parameter_constraints
    raise InvalidParameterError(
sklearn.utils._param_validation.InvalidParameterError: The 'n_estimators' parameter of RandomForestRegressor must be an int in the range [1, inf). Got {'n_estimators': 460, 'bootstrap': False, 'criterion': 'poisson', 'max_depth': 60, 'max_features': 2, 'min_samples_leaf': 1, 'min_samples_split': 2} instead.
```

python scikit-learn parameter-passing random-forest
2 Answers

2 votes

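When you pass the hyperparameter dictionary to the constructor, you should unpack it with **:

    model_regressor = RandomForestRegressor(**hparams)

Otherwise the whole dictionary is passed as the first positional argument, and according to the documentation the first parameter of the constructor is n_estimators. That is why the error message reports the entire dictionary as the value of n_estimators.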
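
To see why the unpacking matters, here is a minimal sketch that uses a plain Python function instead of scikit-learn, so it runs without any dependencies. make_forest is a hypothetical stand-in, not part of any library; its signature only imitates the shape of the RandomForestRegressor constructor (one leading parameter, the rest keyword-only). The binding behavior it shows is ordinary Python call semantics.

```python
# Hypothetical stand-in that imitates the shape of the constructor:
# one leading parameter, everything else keyword-only.
def make_forest(n_estimators=100, *, criterion="squared_error", max_depth=None):
    return {
        "n_estimators": n_estimators,
        "criterion": criterion,
        "max_depth": max_depth,
    }


hparams = {"n_estimators": 460, "criterion": "poisson", "max_depth": 60}

# Passed positionally: the whole dict binds to the first parameter,
# and the other parameters keep their defaults.
positional = make_forest(hparams)

# Unpacked with **: each key binds to the parameter of the same name.
unpacked = make_forest(**hparams)

print(positional["n_estimators"])
print(unpacked["n_estimators"])
```

The first print shows the entire dictionary sitting in n_estimators, which mirrors the value reported in the InvalidParameterError; the second shows the integer that was intended.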


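0 votes

You are looping over a list of dictionaries and passing each dictionary to the constructor as a single positional argument, as in this example: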

    from sklearn.ensemble import RandomForestRegressor

    hyperparams = [{
        'n_estimators': 460,
        'bootstrap': False,
        'criterion': 'poisson',
        'max_depth': 60,
        'max_features': 2,
        'min_samples_leaf': 1,
        'min_samples_split': 2
    },
    {
        'n_estimators': 60,
        'bootstrap': False,
        'criterion': 'friedman_mse',
        'max_depth': 90,
        'max_features': 3,
        'min_samples_leaf': 1,
        'min_samples_split': 2
    }]
    for hparams in hyperparams:
        # Reproduces the problem: the dict is passed positionally,
        # so it is stored as n_estimators.
        model_regressor = RandomForestRegressor(hparams)
        print(model_regressor.get_params())


Output:

    {'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': None, 'max_features': 1.0, 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'n_estimators': {'n_estimators': 460, 'bootstrap': False, 'criterion': 'poisson', 'max_depth': 60, 'max_features': 2, 'min_samples_leaf': 1, 'min_samples_split': 2}, 'n_jobs': None, 'oob_score': False, 'random_state': None, 'verbose': 0, 'warm_start': False}
    {'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': None, 'max_features': 1.0, 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'n_estimators': {'n_estimators': 60, 'bootstrap': False, 'criterion': 'friedman_mse', 'max_depth': 90, 'max_features': 3, 'min_samples_leaf': 1, 'min_samples_split': 2}, 'n_jobs': None, 'oob_score': False, 'random_state': None, 'verbose': 0, 'warm_start': False}

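In this output, the key 'n_estimators' holds the entire dictionary, while every other parameter still has its default value because nothing was assigned to it.

Compare this version, with a single element in the list, where the values are passed one at a time: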

    hyperparams = [{
        'n_estimators': 460,
        'criterion': 'poisson',
        'bootstrap': False,
        'max_depth': 60,
        'max_features': 2,
        'min_samples_leaf': 1,
        'min_samples_split': 2
    }]
    for hparams in hyperparams:
        model_regressor = RandomForestRegressor(hparams['n_estimators'], criterion=hparams['criterion'])
        print(model_regressor.get_params())

Output:

    {'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'poisson', 'max_depth': None, 'max_features': 1.0, 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'n_estimators': 460, 'n_jobs': None, 'oob_score': False, 'random_state': None, 'verbose': 0, 'warm_start': False}

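Look up the parameters of RandomForestRegressor in the documentation. The first one is n_estimators, which is why a single positional argument ends up there; the others are optional and are passed by keyword. Have fun.

If you want to pass all the parameters, it is better to unpack the dictionary:

    model_regressor = RandomForestRegressor(**hparams)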
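
Since unpacking relies on the dictionary keys matching the constructor's parameter names, it can help to check the keys before the loop constructs anything. The sketch below is one possible way to do that with the standard inspect module. ForestStandIn and unknown_keys are hypothetical names introduced only for illustration, and the stand-in class is used so the example runs without scikit-learn. Passing RandomForestRegressor itself to unknown_keys should work the same way, but that is an assumption and is not tested here.

```python
import inspect


class ForestStandIn:
    """Hypothetical stand-in for an estimator class; not part of scikit-learn."""

    def __init__(self, n_estimators=100, *, criterion="squared_error",
                 max_depth=None, bootstrap=True):
        self.n_estimators = n_estimators
        self.criterion = criterion
        self.max_depth = max_depth
        self.bootstrap = bootstrap


def unknown_keys(estimator_cls, hparams):
    """Return the keys in hparams that the constructor does not accept."""
    accepted = inspect.signature(estimator_cls).parameters
    return sorted(set(hparams) - set(accepted))


hyperparams = [
    {"n_estimators": 460, "criterion": "poisson", "bootstrap": False},
    {"n_estimator": 60, "criterion": "friedman_mse"},  # deliberate typo in the key
]

models = []
for hparams in hyperparams:
    bad = unknown_keys(ForestStandIn, hparams)
    if bad:
        # Skip this entry instead of letting ** unpacking raise a TypeError.
        print("skipping, unknown keys:", bad)
        continue
    models.append(ForestStandIn(**hparams))

print(len(models), models[0].n_estimators)
```

Only the first dictionary produces a model; the second is skipped because n_estimator is not a constructor parameter.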