I want to print the feature names from the first estimator of my GradientBoostingRegressor, but I get the following error. scikit-learn version = 1.2.2
model.estimators_[0]._final_estimator.feature_names_in_
output:
AttributeError Traceback (most recent call last)
Cell In[115], line 1
----> 1 model.estimators_[0]._final_estimator.feature_names_in_
AttributeError: 'GradientBoostingRegressor' object has no attribute 'feature_names_in_'
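You wrote that you specifically want the feature names of the first estimator of the ensemble. Unfortunately, feature names are not stored on the individual trees. That is why you get the error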
AttributeError: 'GradientBoostingRegressor' object has no attribute 'feature_names_in_'
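However, since the trees are trained on the same feature set as the whole model, the feature names of the main GradientBoostingRegressor apply to each of its decision trees. Note that scikit-learn only sets this attribute when the model was fitted on input with string column names, such as a pandas DataFrame. You can therefore extract the feature names of the ensemble (which also apply to the first tree) like this: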
model.feature_names_in_
If you are interested in the names of the features that the first tree actually uses, you can do this:
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.datasets import fetch_california_housing
# Load the dataset
data = fetch_california_housing()
X, y = data.data, data.target
feature_names = data.feature_names
# Create and fit the GradientBoostingRegressor
model = GradientBoostingRegressor(max_features=0.5, random_state=0)
model.fit(X, y) # Directly fit on X, y without converting to DataFrame
# Access the tree of the first boosting stage (estimators_ has shape (n_estimators, 1) for regression)
first_tree = model.estimators_[0, 0]
# Get the feature indices used in the first tree; leaf nodes carry a negative marker, so filter them out
used_feature_indices = sorted({i for i in first_tree.tree_.feature if i >= 0})
# Map indices to feature names
used_feature_names = [feature_names[i] for i in used_feature_indices]
print("All feature names:", feature_names)
print("Names of features used in the first tree:", used_feature_names)
print("Names of features not used in the first tree:", set(feature_names) - set(used_feature_names))
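The same lookup can be wrapped in a small helper so that it works for any boosting stage, not just the first one. This is only a sketch: the function name `features_used_by_stage`, the synthetic data from `make_regression`, and the placeholder names `f0` to `f5` are assumptions for illustration, chosen so that the example runs without downloading a dataset:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor


def features_used_by_stage(model, feature_names, stage=0):
    """Return the names of the features that the tree at `stage` splits on."""
    tree = model.estimators_[stage, 0].tree_
    # tree.feature holds one entry per node; leaves carry a negative marker.
    used = np.unique(tree.feature[tree.feature >= 0])
    return [feature_names[i] for i in used]


# Synthetic regression data with hypothetical feature names.
X, y = make_regression(n_samples=200, n_features=6, n_informative=3, random_state=0)
names = [f"f{i}" for i in range(X.shape[1])]

model = GradientBoostingRegressor(
    n_estimators=5, max_features=0.5, random_state=0
).fit(X, y)

# Print the features each of the first three trees splits on.
for stage in range(3):
    print(stage, features_used_by_stage(model, names, stage))
```

Because `max_features=0.5` makes each split consider only a random subset of the features, different stages may well end up using different features.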