Clipping the output of sklearn pipeline predictions


Consider the following pipeline:

scaler = ScalerFactory.get_scaler(scaler_type)
model = MultiOutputRegressor(lgb.LGBMRegressor(metric='tweedie', **hyperparameters))
steps = [('scaler', scaler), ('model', model)]
pipeline = Pipeline(steps)
pipeline.fit(X, y, model__feature_name=list(X.columns))

I am trying to add another step to the pipeline so that, when it predicts, every value between -1 and 1 is set to 0.

I tried creating a new class:

from numpy.random import randint
from sklearn.base import BaseEstimator, TransformerMixin


class OutputClipper(BaseEstimator, TransformerMixin):
    def __init__(self) -> None:
        super().__init__()
        self.clipping = False

    def fit(self, X, y=None):
        return self
    
    def transform(self, X, y):
        y[(y>-1) & (y<1)] = 0
        return y

The new pipeline becomes:

steps = [('scaler', scaler), ('model', model), ('clipping',OutputClipper) ]
pipeline = Pipeline(steps)

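However, I have a feeling this does not quite work. My guess was that the transformation would happen when the pipeline is called with the .predict() method. I also don't know how to test it.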

python python-3.x scikit-learn pipeline scikit-learn-pipeline
1 Answer

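A Pipeline only chains transformers that preprocess X: its predict() passes X through every step except the last and then calls predict() on the final estimator, so no step can run on the predictions. To post-process predictions, you can wrap the estimator in a meta-estimator instead.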
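To see why the extra step in the question cannot work, here is a minimal sketch that puts a transformer after the regressor. StandardScaler, LinearRegression, the synthetic data, and the PassthroughStep class are assumptions standing in for the question's ScalerFactory, LightGBM model, and OutputClipper. Fitting should be rejected, because every step before the last one must implement transform and a regressor does not.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class PassthroughStep(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for a 'post-processing' transformer."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


# Small synthetic regression problem (illustrative data only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X[:, 0] - X[:, 1]

error = None
try:
    # The regressor is now an intermediate step, so the pipeline
    # expects it to have a transform method, which it lacks.
    Pipeline([
        ("scaler", StandardScaler()),
        ("model", LinearRegression()),
        ("clipping", PassthroughStep()),
    ]).fit(X, y)
except TypeError as exc:
    error = exc

print(type(error).__name__, error)
```

Running this is also a quick way to test a candidate pipeline layout: if fit raises, the step order is not one that Pipeline supports.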

For example, a wrapper that clips the predictions of any estimator:

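from sklearn.base import BaseEstimator, MetaEstimatorMixin, clone


class OutputClipper(MetaEstimatorMixin, BaseEstimator):
    """Wrap an estimator and zero out predictions strictly between -1 and 1."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y, **kwargs):
        # Fit a clone so the estimator passed in is left untouched.
        self.estimator_ = clone(self.estimator)
        self.estimator_.fit(X, y, **kwargs)
        return self

    def predict(self, X, **kwargs):
        y = self.estimator_.predict(X, **kwargs)
        # Post-process: set every value in the open interval (-1, 1) to 0.
        y[(y > -1) & (y < 1)] = 0
        return y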


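# And then your code: wrap the whole pipeline so the scaler is still applied.
scaler = ScalerFactory.get_scaler(scaler_type)
model = MultiOutputRegressor(lgb.LGBMRegressor(metric='tweedie', **hyperparameters))
pipeline = Pipeline([('scaler', scaler), ('model', model)])
clipped_pipeline = OutputClipper(pipeline)
clipped_pipeline.fit(X, y, model__feature_name=list(X.columns))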
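
As for how to test it, one option is to compare the wrapper's predictions with those of the same unwrapped estimator. The sketch below is self-contained and rests on assumptions: StandardScaler and LinearRegression replace the question's ScalerFactory and LightGBM model, the data is synthetic, and the wrapper is restated (with BaseEstimator added) so the block runs on its own.

```python
import numpy as np
from sklearn.base import BaseEstimator, MetaEstimatorMixin, clone
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class OutputClipper(MetaEstimatorMixin, BaseEstimator):
    """Wrap an estimator and zero out predictions strictly between -1 and 1."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y, **fit_params):
        self.estimator_ = clone(self.estimator)
        self.estimator_.fit(X, y, **fit_params)
        return self

    def predict(self, X, **predict_params):
        y = np.asarray(self.estimator_.predict(X, **predict_params), dtype=float)
        y[(y > -1) & (y < 1)] = 0
        return y


# Synthetic two-target regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.column_stack([X @ [2.0, -1.0, 0.5], X @ [0.3, 0.0, -3.0]])

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", MultiOutputRegressor(LinearRegression())),
])

# Fit the wrapped pipeline and an identical unwrapped copy on the same data.
clipped = OutputClipper(pipeline).fit(X, y)
raw = clone(pipeline).fit(X, y).predict(X)
pred = clipped.predict(X)

# Mask of raw predictions that fall inside the open interval (-1, 1).
inside = (raw > -1) & (raw < 1)

assert np.all(pred[inside] == 0)               # values in the band become 0
assert np.allclose(pred[~inside], raw[~inside])  # all other values are untouched
```

Wrapping the whole pipeline, rather than only the model, keeps the scaling step in place while still applying the clipping after predict.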