Using GaussianProcessHyperparamOpt in DeepChem


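I'm trying to use the hyperparam_search function of DeepChem's GaussianProcessHyperparamOpt. I'm following what the test script for the HyperparamOpt class does (there is no test for the Gaussian process class).

After loading the training set and so on, I define a dict of hyperparameters: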

hps = {
      'layer_sizes': [1500],
      'weight_init_stddevs': [0.02],
      'bias_init_consts': [1.],
      'dropouts': [0.5],
      'penalty': 0.1,
      'penalty_type': 'l2',
      'batch_size': 50,
      'nb_epoch': 10,
      'learning_rate': 0.001  }

Then I define a model_builder function (copied and pasted from the test script):

def model_builder(model_params, model_dir):
    return dc.models.MultitaskRegressor(
        len(tasks), n_features, model_dir=model_dir, **model_params)

Then I define the metric, which according to the code should be given as a list of length 1:

regression_metric = [dc.metrics.Metric(dc.metrics.r2_score)]

Then I call the hyperparam_search function and pass it the required arguments:

optimizer = dc.hyper.GaussianProcessHyperparamOpt(model_builder)
best_hyper_params, best_performance, all_results = optimizer.hyperparam_search(
    hps, 
    train,
    valid,
    transformers,
    regression_metric
)

And I get a KeyError:

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)

<ipython-input-85-be984f768f5b> in <module>()
      5     valid,
      6     transformers,
----> 7     metric = regression_metric
      8 )
      9 #    metric = regression_metric

1 frames

/usr/lib/python3.6/os.py in __getitem__(self, key)
    667         except KeyError:
    668             # raise KeyError with the original key value
--> 669             raise KeyError(key) from None
    670         return self.decodevalue(value)
    671 

KeyError: 'DEEPCHEM_DATA_DIR'
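
The last frame of the traceback is in os.py, which is what Python shows when os.environ is indexed with a name that is not set. The following is a minimal sketch of that behaviour and of one possible workaround, setting the variable before the search runs. The helper names strict_lookup and lenient_lookup are hypothetical, and using the system temp directory as the value is only an assumption; whether this is enough to get past the error in a given DeepChem version is not confirmed by this thread.

```python
import os
import tempfile


def strict_lookup(env):
    """Index the mapping directly; raises KeyError if the variable is unset."""
    return env["DEEPCHEM_DATA_DIR"]


def lenient_lookup(env):
    """Fall back to the system temp directory when the variable is unset."""
    return env.get("DEEPCHEM_DATA_DIR", tempfile.gettempdir())


# Reproduce the failure mode from the traceback with an empty mapping.
try:
    strict_lookup({})
except KeyError as err:
    print("missing variable:", err)

# Possible workaround (an assumption, not a confirmed fix): define the
# variable in the current process before calling hyperparam_search.
os.environ.setdefault("DEEPCHEM_DATA_DIR", tempfile.gettempdir())
print(strict_lookup(os.environ))
```

In a notebook, the setdefault line would go in a cell that runs before the optimizer is created.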

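To prove I'm not crazy, here is the code of the function. You can see that it expects the metric as the fifth argument after self, right after output_transformers: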

  def hyperparam_search(
      self,
      params_dict,
      train_dataset,
      valid_dataset,
      output_transformers,
      metric,
      direction=True,
      n_features=1024,
      n_tasks=1,
      max_iter=20,
      search_range=4,
      hp_invalid_list=[
          'seed', 'nb_epoch', 'penalty_type', 'dropouts', 'bypass_dropouts',
          'n_pair_feat', 'fit_transformers', 'min_child_weight',
          'max_delta_step', 'subsample', 'colsample_bylevel',
          'colsample_bytree', 'reg_alpha', 'reg_lambda', 'scale_pos_weight',
          'base_score'
      ],
      log_file='GPhypersearch.log'):
    """Perform hyperparams search using a gaussian process assumption

    params_dict include single-valued parameters being optimized,
    which should only contain int, float and list of int(float)

    parameters with names in hp_invalid_list will not be changed.

    For Molnet models, self.model_class is model name in string,
    params_dict = dc.molnet.preset_hyper_parameters.hps[self.model_class]

    Parameters
    ----------
    params_dict: dict
      dict including parameters and their initial values
      parameters not suitable for optimization can be added to hp_invalid_list
    train_dataset: dc.data.Dataset struct
      dataset used for training
    valid_dataset: dc.data.Dataset struct
      dataset used for validation(optimization on valid scores)
    output_transformers: list of dc.trans.Transformer
      transformers for evaluation
    metric: list of dc.metrics.Metric
      metric used for evaluation

Here's the funky part: if I pass the metric as an object rather than as a list containing the object:

regression_metric = dc.metrics.Metric(dc.metrics.r2_score)

I don't get the KeyError, but the function crashes when it checks that the metric is a list of length 1.

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-87-be984f768f5b> in <module>()
      5     valid,
      6     transformers,
----> 7     metric = regression_metric
      8 )
      9 #    metric = regression_metric

/usr/local/lib/python3.7/site-packages/deepchem/hyper/gaussian_process.py in hyperparam_search(self, params_dict, train_dataset, valid_dataset, output_transformers, metric, direction, n_features, n_tasks, max_iter, search_range, hp_invalid_list, log_file)
     89     """
     90 
---> 91     assert len(metric) == 1, 'Only use one metric'
     92     hyper_parameters = params_dict
     93     hp_list = list(hyper_parameters.keys())

TypeError: object of type 'Metric' has no len()
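
This TypeError comes from the assertion itself: len() only works on objects that define __len__, so the check fails before it can compare anything. The sketch below reproduces that with FakeMetric, a hypothetical stand-in for dc.metrics.Metric, and a check_single_metric helper that mirrors the assertion from the traceback. Neither name is part of DeepChem.

```python
class FakeMetric:
    """Hypothetical stand-in for dc.metrics.Metric; it defines no __len__."""

    def __init__(self, name):
        self.name = name


def check_single_metric(metric):
    """Mirror the assertion in the traceback: a sequence holding one metric."""
    assert len(metric) == 1, 'Only use one metric'
    return metric[0]


bare = FakeMetric("r2_score")

# Passing the bare object fails inside len(), before the comparison runs.
try:
    check_single_metric(bare)
except TypeError as err:
    print(err)  # object of type 'FakeMetric' has no len()

# Wrapping the object in a one-element list satisfies the check.
print(check_single_metric([bare]).name)  # r2_score
```

So the list form is the one the function expects, and the bare-object call only fails earlier, before the code that reads DEEPCHEM_DATA_DIR is reached.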

Any DeepChem users out there: if you have gotten this function to work, please give me a hint!

Tags: python, keyerror

1 answer

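We did some back-and-forth debugging in the Gitter channel, and it turned out that a few underlying bugs were causing this. Here are a few issues that document the fixes: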

Thanks for helping to work through these! The next stable release of DeepChem should have these problems fixed.
