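I want to know how important each feature is for my data, so I am using permutation_importance. When I get the results, the features appear to have been encoded, so I tried to use
get_feature_names_out
to recover their names. That raises the error 'StandardScaler' object has no attribute 'get_feature_names_out'.
If I try to map the names by hand, I am afraid I will get the order wrong. The order should be (3, 0, 1, 2): smoker, age, bmi, sex.
Here is the code: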
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.inspection import permutation_importance
# Prepare data
X = df[['age', 'bmi', 'sex', 'smoker']]
y = df['charges']
# Define the preprocessor
categorical_transformer = OneHotEncoder(drop='first', sparse=False)
numerical_transformer = StandardScaler()
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, ['age', 'bmi']),
        ('cat', categorical_transformer, ['sex', 'smoker'])
    ]
)
# Preprocess the data
X_preprocessed = preprocessor.fit_transform(X)
# Extract feature names
num_features = numerical_transformer.get_feature_names_out(['age', 'bmi'])
cat_features = categorical_transformer.get_feature_names_out(['sex', 'smoker'])
feature_names = np.concatenate([num_features, cat_features])
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X_preprocessed, y, test_size=0.2, random_state=42)
# Train KNeighborsRegressor
knn_regressor = KNeighborsRegressor()
reg_model = knn_regressor.fit(X_train, y_train)
# Evaluate feature importance using permutation importance
results = permutation_importance(knn_regressor, X_test, y_test, n_repeats=10, random_state=42, scoring='neg_mean_squared_error')
# Display feature importances with names
for i, importance in enumerate(results.importances_mean):
    print(f"Feature '{feature_names[i]}': Importance: {importance}")
sorted_indices = np.argsort(results.importances_mean)
for i in sorted_indices[::-1]:
    print(f"Feature '{feature_names[i]}', Importance: {results.importances_mean[i]}")
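I would like to get the feature names. Perhaps someone can also explain why the order of the feature importances looks wrong: I plotted charges against each feature by hand, and the correct order should be smoker, age, bmi, sex.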
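It does not work on the standalone transformers because the fitting and transforming was done through the preprocessor (ColumnTransformer). ColumnTransformer fits clones of the transformers you pass in, so numerical_transformer and categorical_transformer themselves are never fitted. You can get the names by selecting the step inside the ColumnTransformer: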
preprocessor["cat"].get_feature_names_out()