XGBoost model prediction with NaN input values


I've run into some strange behavior with the xgboost classifier. I copied the code from an answer to this post:
import xgboost as xgb
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

X, y = make_moons(noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)

xgb_clf = xgb.XGBClassifier()
xgb_clf = xgb_clf.fit(X_train, y_train)

print(xgb_clf.predict(X_test))
print(xgb_clf.predict_proba(X_test))

>>[0 0 1 0 0 1 0 0 1 1]
[[0.97378635 0.02621362]
 [0.97106457 0.0289354 ]
 [0.45146966 0.54853034]
 [0.9181994  0.08180059]
 [0.97378635 0.02621362]
 [0.4264453  0.5735547 ]
 [0.6279408  0.37205923]
 [0.991474   0.00852604]
 [0.06204838 0.9379516 ]
 [0.08833408 0.9116659 ]]

So far so good. However, the model still makes a prediction even when the input consists entirely of NaN values.

b = np.empty([3,2])
b[:] = np.nan
xgb_clf.predict_proba(b)

>>array([[0.8939177 , 0.10608231],
   [0.8939177 , 0.10608231],
   [0.8939177 , 0.10608231]], dtype=float32)

This caught me completely off guard. Am I missing some parameter that would make the classifier return NaN for such inputs as well?

python classification xgboost
1 Answer

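This is expected behavior. XGBoost treats NaN as a missing value and sends it down the default branch learned for each split, so even an all-NaN row reaches a leaf and gets a score. If you want to override the model's predictions for such rows, you have to do it manually.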
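A minimal sketch of such a manual override, assuming that "no usable input" means a row in which every feature is NaN. The helper name predict_proba_or_nan is hypothetical and not part of xgboost; it only relies on the model exposing predict_proba.

```python
import numpy as np


def predict_proba_or_nan(model, X):
    """Call model.predict_proba(X), then blank out rows that are entirely NaN.

    Hypothetical helper: `model` is any fitted classifier with a
    predict_proba method (for example the xgb_clf from the question).
    """
    X = np.asarray(X, dtype=float)
    # Copy into a float array so the model's own output is not modified in place.
    proba = np.array(model.predict_proba(X), dtype=float)
    # True for rows where every feature is NaN.
    all_missing = np.isnan(X).all(axis=1)
    # Replace the probabilities of those rows with NaN.
    proba[all_missing] = np.nan
    return proba
```

With the classifier from the question, predict_proba_or_nan(xgb_clf, b) would return a 3x2 array of NaN, while rows with at least one real feature value keep the model's probabilities. Change the mask to np.isnan(X).any(axis=1) if a single missing feature should also suppress the prediction.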
