I can't get my data to fit properly into XGBoost, and changing the data types doesn't help.
There are 1225 rows and 15 columns.
RangeIndex(start=0, stop=1225, step=1)
Other classification algorithms work fine, but after running the code below, XGBoost gives me the following error.
import xgboost as xgb
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(loans.index, loans.BAD, test_size=0.2, random_state=0)
train = xgb.DMatrix(X_train, label=y_train)
test = xgb.DMatrix(X_test, label=y_test)
param = {'max_depth':2, 'eta':1, 'objective':'binary:logistic' }
num_round = 2
bst = xgb.train(param, X_train, 10)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-117-378a1a19d4c9> in <module>
1 param = {'max_depth':2, 'eta':1, 'objective':'binary:logistic' }
2 num_round = 2
----> 3 bst = xgb.train(param, X_train, 10)
~\Anaconda3\lib\site-packages\xgboost\training.py in train(params, dtrain, num_boost_round, evals, obj, feval, maximize, early_stopping_rounds, evals_result, verbose_eval, xgb_model, callbacks)
207 evals=evals,
208 obj=obj, feval=feval,
--> 209 xgb_model=xgb_model, callbacks=callbacks)
210
211
~\Anaconda3\lib\site-packages\xgboost\training.py in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks)
28 params += [('eval_metric', eval_metric)]
29
---> 30 bst = Booster(params, [dtrain] + [d[0] for d in evals])
31 nboost = 0
32 num_parallel_tree = 1
~\Anaconda3\lib\site-packages\xgboost\core.py in __init__(self, params, cache, model_file)
1026 for d in cache:
1027 if not isinstance(d, DMatrix):
-> 1028 raise TypeError('invalid cache item: {}'.format(type(d).__name__), cache)
1029 self._validate_features(d)
1030
TypeError: ('invalid cache item: Int64Index', [Int64Index([ 359, 745, 682, 903, 548, 906, 1040, 467, 85, 192,
...
600, 1094, 599, 277, 1033, 763, 835, 1216, 559, 684],
dtype='int64', length=980)])
When using the Learning API, xgboost.train expects a DMatrix as its training data, but you are feeding it X_train, which is a plain pandas Int64Index. You should pass the DMatrix you already built (train) instead of X_train.
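A minimal sketch of the corrected call, reusing train, test, and param as defined in the question (num_round is taken from the question's own setup):

import xgboost as xgb

param = {'max_depth': 2, 'eta': 1, 'objective': 'binary:logistic'}
num_round = 2

# pass the train DMatrix (built from X_train/y_train above), not the raw index
bst = xgb.train(param, train, num_round)

# predict on the held-out test DMatrix
preds = bst.predict(test)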