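I am trying to train an XGBoost model with Dask, following the documentation. I am stuck at the step where I have to create a DaskDMatrix: whatever I try, I get an error saying that the attribute does not exist. I tried: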
import dask.distributed
import dask_xgboost as dxgb
client = dask.distributed.Client()
params = {'objective': 'binary:logistic',
          'booster': 'dart',
          'n_estimators': 800,
          'max_depth': 4,
          'learning_rate': 0.02,
          'random_state': 42}
dtrain = dxgb.DaskDMatrix(client, X_train, y_train)
dval = dxgb.DaskDMatrix(client, X_val, y_val)
eval_set = [(dtrain, 'train'), (dval, 'validation')]
model = dxgb.train(client, params, dtrain, num_boost_round=10, evals=eval_set, eval_metric=['logloss', 'aucpr'], verbose=True)
==> AttributeError: module 'dask_xgboost' has no attribute 'DaskDMatrix'
Then I tried:
import xgboost as xgb
# Create a Dask DMatrix from a Dask DataFrame
dtrain = xgb.dask.DaskDMatrix(client, X_train, y_train)
==> AttributeError: module 'xgboost' has no attribute 'dask'
And also:
import dask_ml.xgboost as dxgb
# Create a Dask DMatrix from a Dask DataFrame
dtrain = dxgb.DaskDMatrix(client, X_train, y_train)
==> AttributeError: module 'dask_ml.xgboost' has no attribute 'DaskDMatrix'
Where can I find the correct code?
I ran the code on my laptop with a fresh conda install of xgboost and dask-xgboost, following the documentation:
import xgboost as xgb
import dask.array as da
import dask.distributed
client = dask.distributed.Client()
params = {'objective': 'binary:logistic',
          'booster': 'dart',
          'n_estimators': 800,
          'max_depth': 4,
          'learning_rate': 0.02,
          'random_state': 42}
# XGBoost expects X as (n_samples, n_features) and one label per sample
X_train = da.random.random((100000, 10))
y_train = da.random.random((100000,))
dtrain = xgb.dask.DaskDMatrix(client, X_train, y_train)
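
If the last line still raises the AttributeError from the question, one possible cause (an assumption, not something the error message confirms) is that the xgboost in the active environment is too old to ship a dask submodule, or that the submodule cannot be imported. A minimal sketch for checking this is below. check_submodule is a hypothetical helper name for illustration and is not part of xgboost or dask; it only uses the standard library.

```python
import importlib
from importlib import metadata


def check_submodule(package: str, submodule: str) -> dict:
    """Report the installed version of `package` and whether
    `package.submodule` can be imported in this environment."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Not installed as a distribution under this name
        version = None

    try:
        importlib.import_module(f"{package}.{submodule}")
        available = True
    except Exception:
        # Missing package, missing submodule, or a failure while importing it
        available = False

    return {
        "package": package,
        "version": version,
        "submodule_available": available,
    }


if __name__ == "__main__":
    # Hypothetical usage: check the environment the notebook actually runs in
    print(check_submodule("xgboost", "dask"))
```

If submodule_available comes back False, upgrading xgboost in that environment and importing the submodule explicitly (import xgboost.dask) before using DaskDMatrix may be worth trying.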