How to pass a dataset to a Python function without passing it as an argument


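I am writing a function that will later be optimized, so I cannot pass the data to it as an argument. The function's arguments are limited to the parameters being optimized.

I need to get the data into the function somehow, and I would like to know how to do this with a global variable or a class. Currently I read dtrain inside the function, which is not right, because I have to edit the function every time the data changes.

If I write the function in the script itself, it works fine. But I am writing it as a module that is later imported into the script.

Here is my function: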


def bo_xgb_evaluate(learning_rate, subsample, colsample_bytree, gamma,
                    min_child_weight, max_depth, tweedie_variance_power):
    import numpy as np
    import xgboost as xgb
    xgbparams = {'eval_metric': 'tweedie-nloglik@' + str(np.round(tweedie_variance_power, 2)),
                 'objective': 'reg:tweedie',
                 'nthread': 4,
                 'learning_rate': learning_rate,
                 'max_depth': int(max_depth),
                 'subsample': max(min(subsample, 1), 0),
                 'colsample_bytree': max(min(colsample_bytree, 1), 0),
                 'gamma': gamma,
                 'min_child_weight': int(min_child_weight),
                 'seed': 1001}
    folds = 4
    print("\n Search parameters:\n %s" % (xgbparams))

    dtrain = xgb.DMatrix('/train.buffer')

    cv_result = xgb.cv(xgbparams,
                       dtrain,
                       num_boost_round=1000,
                       nfold=folds,
                       # stop the training when validation scores have not improved for 10 estimators
                       early_stopping_rounds=10,
                       metrics='tweedie-nloglik@' + str(np.round(tweedie_variance_power, 2)),
                       verbose_eval=5,
                       seed=1367)

    val_score = -1.0 * cv_result['test-tweedie-nloglik@' + str(np.round(tweedie_variance_power, 2)) + '-mean'].iloc[-1]
    train_score = -1.0 * cv_result['train-tweedie-nloglik@' + str(np.round(tweedie_variance_power, 2)) + '-mean'].iloc[
        -1]

    print('\n Stopped after %d iterations with train-deviance  = %f val-deviance = %f ( diff = %f )' %
          (len(cv_result), train_score, val_score, (train_score - val_score)))
    return val_score
Tags: python, class, global-variables

1 answer

Maybe break your function up, so that you have one function that returns xgbparams, one that returns dtrain, and one that takes xgbparams and dtrain and does the calculation.

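def get_xgbparams():
    # Build the parameter dict here (values taken from the question).
    xgbparams = {'objective': 'reg:tweedie', 'nthread': 4}
    return xgbparams

def get_dtrain():
    import xgboost as xgb
    # Load the data here, outside the function that gets optimized.
    dtrain = xgb.DMatrix('/train.buffer')
    return dtrain

def run_cv(xgbparams, dtrain):
    import xgboost as xgb
    # Run the thing with the prepared parameters and data.
    return xgb.cv(xgbparams, dtrain, nfold=4)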

Also, by splitting it up you will have a better chance of figuring out what works.
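One way to connect the split-up functions back to an optimizer that only supplies the tuned parameters is a closure. The sketch below is an illustration, not the answerer's code: `make_objective`, the two-parameter signature, and the plain-list data are hypothetical stand-ins, and the arithmetic replaces the real `xgb.cv` call so the example runs without xgboost.

```python
def make_objective(dtrain):
    """Return an objective whose only arguments are the tuned parameters.

    `dtrain` is captured by the inner function, so it never has to appear
    in the signature that the optimizer sees.
    """
    def objective(learning_rate, max_depth):
        # Stand-in for the real evaluation (e.g. an xgb.cv call on dtrain).
        return -sum((x * learning_rate - max_depth) ** 2 for x in dtrain)
    return objective


# Hypothetical data; in the real module this would be the loaded DMatrix.
data = [1.0, 2.0, 3.0]
objective = make_objective(data)

# The optimizer would call the objective with the tuned parameters only.
score = objective(learning_rate=0.5, max_depth=1)
```

When the data changes, only the call to `make_objective` changes; the module that defines the function stays untouched.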

Otherwise, if you want to create a class, you can pass the data around through self, as self.dtrain.

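import xgboost as xgb  # import at module level so __init__ can see xgb

class myclass(object):
    def __init__(self, data):
        # Load the data once and keep it on the instance.
        self.dtrain = xgb.DMatrix(data)

    # etc. (other methods can read self.dtrain)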

Then, when you instantiate the class, just pass the data in as the input:

myclass('/train.buffer')
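To show how the class approach might plug into an optimizer, here is a minimal sketch. The `Evaluator` class, its `evaluate` method, and the list data are hypothetical names chosen for illustration, and the arithmetic stands in for the real cross-validation call. The point is that a bound method carries `self` (and so `self.dtrain`) with it, so its visible signature contains only the tuned parameters.

```python
import inspect


class Evaluator:
    def __init__(self, data):
        # In the real module this would be xgb.DMatrix(data).
        self.dtrain = list(data)

    def evaluate(self, learning_rate, max_depth):
        # Stand-in for the real evaluation; reads the data from the instance.
        return -sum((x * learning_rate - max_depth) ** 2 for x in self.dtrain)


evaluator = Evaluator([1.0, 2.0, 3.0])

# A bound method already has `self` filled in, so it can be handed to
# anything that expects a function of the tuned parameters only.
objective = evaluator.evaluate
param_names = list(inspect.signature(objective).parameters)
score = objective(learning_rate=0.5, max_depth=1)
```

Swapping in new data then means creating a new `Evaluator` instance in the script, while the imported module stays the same.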