我正在按照 TensorFlow 官方网站这里训练基本的 LSTM 进行文本预测。我已成功在 GTX 1050ti 上训练模型最多 40 个 epoch,并将 checkPoint 文件保存在单独的文件夹中。然而,当我现在尝试恢复模型时,我收到了这个很长的错误:-
StreamExecutor device (0): GeForce GTX 1050 Ti, Compute Capability 6.1
WARNING:tensorflow:Entity <function standard_gru at 0x7f9e121324d0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <function standard_gru at 0x7f9e121324d0>: AttributeError: module 'gast' has no attribute 'Num'
WARNING:tensorflow:Entity <function cudnn_gru at 0x7f9e120c1d40> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <function cudnn_gru at 0x7f9e120c1d40>: AttributeError: module 'gast' has no attribute 'Num'
WARNING:tensorflow:Entity <function standard_gru at 0x7f9e121324d0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <function standard_gru at 0x7f9e121324d0>: AttributeError: module 'gast' has no attribute 'Num'
WARNING:tensorflow:Entity <function cudnn_gru at 0x7f9e120c1d40> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <function cudnn_gru at 0x7f9e120c1d40>: AttributeError: module 'gast' has no attribute 'Num'
WARNING:tensorflow:From /home/awesome_ruler/.local/lib/python3.7/site-packages/tensorflow/python/training/tracking/util.py:1200: NameBasedSaverStatus.__init__ (from tensorflow.python.training.tracking.util) is deprecated and will be removed in a future version.
Instructions for updating:
Restoring a name-based tf.train.Saver checkpoint using the object-based restore API. This mode uses global names to match variables, and so is somewhat fragile. It also adds new restore ops to the graph each time it is called when graph building. Prefer re-encoding training checkpoints in the object-based format: run save() on the object-based saver (the same one this message is coming from) and use that checkpoint in the future.
Traceback (most recent call last):
File "main.py", line 95, in <module>
model.load_weights(checkpoint_dir)
File "/home/awesome_ruler/.local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 162, in load_weights
return super(Model, self).load_weights(filepath, by_name)
File "/home/awesome_ruler/.local/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 1398, in load_weights
status.assert_nontrivial_match()
File "/home/awesome_ruler/.local/lib/python3.7/site-packages/tensorflow/python/training/tracking/util.py", line 917, in assert_nontrivial_match
return self.assert_consumed()
File "/home/awesome_ruler/.local/lib/python3.7/site-packages/tensorflow/python/training/tracking/util.py", line 894, in assert_consumed
(unused_attributes,))
AssertionError: Some objects had attributes which were not restored: {<tf.Variable 'embedding_1/embeddings:0' shape=(65, 256) dtype=float32, numpy=
array([[-0.00044268, -0.02351714, -0.01139065, ..., -0.00327835,
0.00074228, -0.00383734],
[-0.02313181, 0.04697707, -0.02350216, ..., 0.040385 ,
0.03087702, 0.02765551],
[ 0.0410727 , 0.00130001, 0.0051438 , ..., 0.02899202,
0.04258115, -0.03773504],
...,
[-0.03134514, 0.01370119, 0.00993627, ..., -0.02257681,
0.02617678, 0.03761976],
[-0.02954974, 0.02407967, 0.02768463, ..., -0.0056519 ,
-0.01507735, 0.04617763],
[-0.04113789, -0.03544737, 0.01056757, ..., 0.01236727,
-0.01791535, -0.01635399]], dtype=float32)>: ['embedding_1/embeddings'], <tf.Variable 'dense_1/kernel:0' shape=(1024, 65) dtype=float32, numpy=
array([[-6.7811467e-02, -2.5536597e-02, 5.1763237e-02, ...,
-6.9665730e-02, 3.9457709e-02, -5.3290475e-02],
[ 1.5835620e-02, -3.0763537e-02, -7.4058644e-02, ...,
3.8087368e-05, -9.1508478e-03, 5.5485427e-02],
[ 3.8143486e-02, 8.8131428e-04, -2.3478847e-02, ...,
-1.5135627e-02, -5.2146181e-02, 7.1185097e-02],
...,
[-6.6591002e-02, 4.7627889e-02, 5.7474524e-02, ...,
4.1528463e-02, 4.6467118e-02, -3.0670539e-02],
[-5.0804108e-02, 5.4505378e-02, -1.5776977e-03, ...,
2.1875933e-02, -2.9637258e-02, 2.0201296e-02],
[-4.7325939e-02, -8.0013275e-03, -3.6348965e-02, ...,
-7.0560835e-02, -4.9752403e-02, 1.0509960e-02]], dtype=float32)>: ['dense_1/kernel'], <tf.Variable 'dense_1/bias:0' shape=(65,) dtype=float32, numpy=
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
dtype=float32)>: ['dense_1/bias'], <tf.Variable 'gru_1/kernel:0' shape=(256, 3072) dtype=float32, numpy=
array([[ 0.00432818, 0.03131782, 0.00038544, ..., -0.00559966,
0.03458985, -0.03219106],
[-0.00865119, 0.01648769, -0.00768028, ..., 0.01366192,
-0.03043955, -0.01382086],
[-0.01379537, 0.00547716, -0.00385967, ..., -0.00027269,
-0.01285852, 0.0377048 ],
...,
[-0.01940641, 0.01454895, 0.03349226, ..., -0.04234404,
-0.02699661, 0.0376601 ],
[ 0.00186675, -0.00547577, -0.02205843, ..., -0.01287581,
-0.02314153, 0.04158166],
[ 0.00954719, -0.02883693, -0.03259185, ..., -0.02587803,
0.02906795, -0.00559821]], dtype=float32)>: ['gru_1/kernel'], <tf.Variable 'gru_1/recurrent_kernel:0' shape=(1024, 3072) dtype=float32, numpy=
array([[ 9.11542401e-03, 1.50135346e-02, 2.96630897e-02, ...,
2.25223936e-02, 2.31253020e-02, -2.96920985e-02],
[-2.21075956e-02, -8.46013427e-06, -2.16848943e-02, ...,
-1.26914177e-02, -3.49153839e-02, -3.01396102e-02],
[-3.59148793e-02, 9.98445973e-03, 2.60963626e-02, ...,
3.15430500e-02, 1.28889643e-02, 3.37569825e-02],
...,
[ 3.39106433e-02, 6.54980540e-03, -1.27352085e-02, ...,
-4.14674729e-03, 3.53236459e-02, -1.36333425e-02],
[-3.50691415e-02, -1.76392253e-02, 1.67468414e-02, ...,
-2.06982102e-02, -1.06042419e-02, 2.26641595e-02],
[-1.14825107e-02, -3.46554294e-02, -1.83847174e-03, ...,
2.25809850e-02, 2.45791934e-02, -2.70933360e-02]], dtype=float32)>: ['gru_1/recurrent_kernel'], <tf.Variable 'gru_1/bias:0' shape=(2, 3072) dtype=float32, numpy=
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]], dtype=float32)>: ['gru_1/bias']}
我正在尝试加载文件
ckpt_40.index
,正如您所看到的,这是最新的检查点。但是我不能。我正在使用此代码加载我的模型==>
checkpoint_dir = 'CheckPoints/ckpt_40.index'
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(checkpoint_dir)
model.summary()
我正在使用网站上的
generate_text
功能来预测某些事情。
我认为类似的问题已发布于on Stack Overflow here,但没有得到解答。我正在使用 Tf[GPU] 2.0- beta1,这是 GPU 的最新 tf 版本...
我犯了一个非常愚蠢的错误,这个错误是如此之小,以至于我怀疑有人能发现它。在这一行中:-
checkpoint_dir = 'CheckPoints/ckpt_40.index'
虽然文件被命名为具有“.index”前缀,但由于某种原因将该扩展名附加到变量/调用函数导致它因某种原因而恐慌(可能是一个错误)。更有用的是指出错误扩展名的错误。
因此,对于遇到此问题的其他人,只需将检查点目录更改为:
# .index has been removed
checkpoint_dir = 'CheckPoints/ckpt_40'