Consider the following pipeline:
scaler = ScalerFactory.get_scaler(scaler_type)
model = MultiOutputRegressor(lgb.LGBMRegressor(metric='tweedie', **hyperparameters))
steps = [('scaler', scaler), ('model', model)]
pipeline = Pipeline(steps)
pipeline.fit(X, y, model__feature_name=list(X.columns))
I am trying to add another step to the pipeline so that, at prediction time, every predicted value between -1 and 1 is set to 0.
I tried creating a new class:
from numpy.random import randint
from sklearn.base import BaseEstimator, TransformerMixin
class OutputClipper(BaseEstimator, TransformerMixin):
    def __init__(self) -> None:
        super().__init__()
        self.clipping = False
    def fit(self, X, y=None):
        return self
    def transform(self, X, y):
        y[(y>-1) & (y<1)] = 0
        return y
The new pipeline becomes:
steps = [('scaler', scaler), ('model', model), ('clipping',OutputClipper) ]
pipeline = Pipeline(steps)
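However, I don't think this works. I assumed the transformation would be applied when the pipeline is called with the .predict() method. I also don't know how to test it.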
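In fact, a pipeline only preprocesses the data: every step before the last must be a transformer applied to X, and nothing runs after the final estimator's predict. To post-process predictions, you can wrap the estimator in a meta-estimator.
For example: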
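from sklearn.base import BaseEstimator, MetaEstimatorMixin, clone
# BaseEstimator adds get_params/set_params, so the wrapper itself can be cloned
class OutputClipper(MetaEstimatorMixin, BaseEstimator):
    def __init__(self, estimator):
        self.estimator = estimator
    def fit(self, X, y, **kwargs):
        # Fit a fresh copy so the estimator passed in is left untouched
        self.estimator_ = clone(self.estimator)
        self.estimator_.fit(X, y, **kwargs)
        return self
    def predict(self, X, **kwargs):
        y = self.estimator_.predict(X, **kwargs)
        # Set every prediction strictly between -1 and 1 to 0
        y[(y>-1) & (y<1)] = 0
        return y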
# And then your code...
scaler = ScalerFactory.get_scaler(scaler_type)
model = MultiOutputRegressor(lgb.LGBMRegressor(metric='tweedie', **hyperparameters))
model = OutputClipper(model)
model.fit(X, y, feature_name=list(X.columns))
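As for testing it: the snippet above fits the wrapped model directly, so the scaler is never applied. One option is to wrap the whole pipeline rather than just the model. The sketch below is only an illustration under assumptions: `StandardScaler` and `LinearRegression` stand in for the `ScalerFactory` scaler and the LightGBM model, and the toy data is made up. It compares the raw predictions of the fitted inner pipeline with the wrapper's output.

```python
import numpy as np
from sklearn.base import BaseEstimator, MetaEstimatorMixin, clone
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class OutputClipper(MetaEstimatorMixin, BaseEstimator):
    """Set predictions that fall strictly between -1 and 1 to 0."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y, **kwargs):
        self.estimator_ = clone(self.estimator)
        self.estimator_.fit(X, y, **kwargs)
        return self

    def predict(self, X, **kwargs):
        y = np.asarray(self.estimator_.predict(X, **kwargs), dtype=float)
        y[(y > -1) & (y < 1)] = 0
        return y


# Toy data (made up for this sketch): two targets, both linear in one feature.
X = np.linspace(-3, 3, 13).reshape(-1, 1)
y = np.column_stack([X[:, 0], 2 * X[:, 0]])

# Stand-ins (assumptions) for the scaler and the LightGBM model in the question.
inner = Pipeline([("scaler", StandardScaler()), ("model", LinearRegression())])
clipped = OutputClipper(inner).fit(X, y)

raw = clipped.estimator_.predict(X)  # predictions before post-processing
out = clipped.predict(X)             # predictions after post-processing

in_band = (raw > -1) & (raw < 1)
assert in_band.any()                              # some predictions fall inside (-1, 1)
assert np.all(out[in_band] == 0)                  # those are set to 0
assert np.allclose(out[~in_band], raw[~in_band])  # all others are unchanged
```

When the wrapped estimator is a pipeline, fit keywords still need the step prefix, as in the original call: `model__feature_name=list(X.columns)`.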