CuDNNLSTM: UnknownError: Fail to find the dnn implementation

Question

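I have successfully run a model with LSTM as its first layer. Out of curiosity, I then replaced LSTM with CuDNNLSTM. After calling model.fit, however, I got the following error message: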

UnknownError: Fail to find the dnn implementation.
    [[{{node cu_dnnlstm_5/CudnnRNN}} = CudnnRNN[T=DT_FLOAT, _class=["loc:@training_2/Adam/gradients/cu_dnnlstm_5/CudnnRNN_grad/CudnnRNNBackprop"], direction="unidirectional", dropout=0, input_mode="linear_input", is_training=true, rnn_mode="lstm", seed=87654321, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](cu_dnnlstm_5/transpose, cu_dnnlstm_5/ExpandDims_1, cu_dnnlstm_5/ExpandDims_1, cu_dnnlstm_5/concat_1)]]
    [[{{node metrics_3/mean_squared_error/Mean_1/_1877}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4852_metrics_3/mean_squared_error/Mean_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

I have already tried TestCudnnLSTM() from this discussion, and it passed the test:

Keras version: 2.2.4
TensorFlow version: 1.12.0
Creating Model
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
cu_dnnlstm_1 (CuDNNLSTM)     (None, 1000, 1)           16
=================================================================
Total params: 16
Trainable params: 16
Non-trainable params: 0
_________________________________________________________________
None
Model compiled

The problem seems to occur during model fitting, but I don't know what exactly is wrong.

Answer 1

For TensorFlow v2, one solution is:

import tensorflow as tf
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], enable=True)

Then you can also use Keras models:

from tensorflow.keras.models import Model

Documentation

This solution worked for me. Note that, as written, it enables memory growth for only one GPU.
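If the machine has more than one GPU, the same call can be applied to each device in turn. The following is a minimal sketch, assuming TensorFlow 2.1 or later (where `tf.config.list_physical_devices` is available). The helper name `enable_memory_growth_on_all_gpus` is made up for this example, and TensorFlow is imported inside the function only so that the helper can be defined before TensorFlow is loaded.

```python
def enable_memory_growth_on_all_gpus():
    """Turn on memory growth for every visible GPU.

    Returns the number of GPUs that were configured successfully.
    """
    # Imported here so that defining the helper does not load TensorFlow.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    configured = 0
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except RuntimeError as err:
            # TensorFlow raises RuntimeError if the GPUs were already
            # initialized, so this must run before any model is built.
            print(f"Could not set memory growth on {gpu.name}: {err}")
    return configured
```

Call it once at the top of the script or notebook, before creating any model or tensor.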

Answer 2

If you get this error while fitting a Keras NN, put this code among your imports:

from keras.backend.tensorflow_backend import set_session
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
set_session(sess)

Credit

Answer 3

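Make sure your Nvidia driver version is suitable for the CUDA version you are using. You can check it here: https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility

I was using CUDA 9.0 with an Nvidia driver older than 384.81. Updating the Nvidia driver to a newer version solved my problem.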
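The driver check described above can be scripted. This is only a sketch: the function names are invented for illustration, the default minimum of 384.81 is the CUDA 9.0 figure quoted in this answer, and the driver lookup assumes `nvidia-smi` is on the PATH (it returns `None` otherwise).

```python
import shutil
import subprocess


def parse_version(text):
    """Turn '384.81' into (384, 81) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_meets_minimum(installed, minimum="384.81"):
    """Return True if the installed driver is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def installed_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return lines[0].strip()
```

For other CUDA versions, look up the minimum driver in the compatibility table linked above and pass it as `minimum`.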

Answer 4

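I ran into the same problem when I updated TensorFlow to 1.12. The error was resolved after I updated my cuDNN version from 7 to 7.5. I followed the steps at the URL below to update cuDNN (note: the steps in the link are for installing cuDNN, but they apply equally to updating it).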

https://jhui.github.io/2017/09/07/AWS-P2-CUDA-CuDNN-TensorFlow/

Answer 5

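In TensorFlow 2.0, I got the same error when running an RNN LSTM model. The cause was that my cuDNN version was too old. The TensorFlow GPU requirements page recommends

cuDNN SDK >= 7.4.1.

See https://www.tensorflow.org/install/gpu for more details.

Question asked in the TensorFlow Reddit forum: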

https://www.reddit.com/r/tensorflow/comments/dxnnq2/i_am_getting_an_error_while_running_the_rnn_lstm/?utm_source=share&utm_medium=web2x
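One way to see which cuDNN version is installed is to read the version macros from the cuDNN header. The sketch below assumes you pass it the text of `cudnn.h` (or `cudnn_version.h` for cuDNN 8 and later); where that file lives depends on the installation, so no path is hard-coded. The function name is hypothetical, and the 7.4.1 threshold is the one quoted in this answer.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of a cuDNN header.

    Returns None if any of the three macros is missing.
    """
    version = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            rf"^\s*#define\s+{key}\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        version.append(int(match.group(1)))
    return tuple(version)


# Example header fragment, written by hand for this illustration.
sample_header = """
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""

found = parse_cudnn_version(sample_header)
is_new_enough = found is not None and found >= (7, 4, 1)
```

In practice, replace `sample_header` with the contents of the header from your own CUDA or cuDNN include directory.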

Answer 6

I suggest checking whether any other kernel has imported tensorflow or keras. If so, shut that kernel down, even if it is not busy. That solved my problem.
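To find out which processes are currently holding the GPU (for example, an idle notebook kernel), you can query `nvidia-smi`. This is a rough sketch with invented function names; it assumes `nvidia-smi` is on the PATH and returns an empty list when it is not.

```python
import shutil
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' rows into a list of dicts."""
    apps = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3:
            continue  # skip rows that do not have the expected three columns
        pid, name, memory = fields
        apps.append({"pid": int(pid), "name": name, "used_memory": memory})
    return apps


def list_gpu_processes():
    """Return the processes nvidia-smi reports as using the GPU."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name,used_memory",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_compute_apps(result.stdout)
```

Any Python process in the result other than the one you are running is a candidate kernel to shut down.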

Answer 7

I installed tensorflow and keras with conda in a virtual environment, and that solved it.

conda install tensorflow
conda install keras

Answer 8

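Also check that cuDNN is present for the CUDA version your application is using.

Upgrading tensorflow may cause it to use a different CUDA version.

For example, tensorflow-2.3 uses CUDA 10.1, but tensorflow-2.5 uses 11.2.

I had the same error on Windows, and I had to copy the latest cuDNN DLLs into the "c:\Program Files\NVIDIA GPU Computing Toolkit\CUDA 11.2" folder.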
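To see which CUDA and cuDNN versions an installed TensorFlow was built against, you can ask TensorFlow itself. The sketch below assumes TensorFlow 2.3 or later, where `tf.sysconfig.get_build_info()` is available; the wrapper name is hypothetical, and it returns `None` when TensorFlow is missing or too old to provide this call.

```python
def tensorflow_build_info():
    """Report the CUDA and cuDNN versions TensorFlow was compiled against.

    Returns None if TensorFlow is not installed or does not expose
    tf.sysconfig.get_build_info (versions before 2.3).
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    get_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_info is None:
        return None

    info = get_info()
    return {
        "tensorflow": tf.__version__,
        # These keys are absent on CPU-only builds, hence .get().
        "cuda": info.get("cuda_version"),
        "cudnn": info.get("cudnn_version"),
    }
```

Compare the reported `cuda` value with the CUDA folder that actually contains your cuDNN files.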

Answer 9

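My code worked after I checked the versions of all of the following packages: CUDA, cuDNN, TensorFlow, and gcc. You need to find versions that match one another. Hope this helps!

My versions are as follows:

Cuda-11.1
Gcc-9
Cudnn-8.2
tensorflow-2.6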
keras-2.6
python-3.6

Answer 10

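For me, the problem was solved after installing the TensorFlow version that matches the installed CUDA version. The correct pairings are listed here: https://www.tensorflow.org/install/source#gpu. To check the CUDA version installed on your machine, use:

nvcc --version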
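If you want to read the CUDA release from a script instead of by eye, the output of `nvcc --version` can be parsed. A minimal sketch, assuming the output contains a line of the form "release X.Y" (the function names and the sample line are illustrative):

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Return (major, minor) from nvcc --version output, or None if not found."""
    match = re.search(r"release\s+(\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Run nvcc --version and return the CUDA release, or None if nvcc is missing."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


# Hand-written sample of the relevant output line.
sample_output = "Cuda compilation tools, release 11.2, V11.2.152"
release = parse_nvcc_release(sample_output)
```

The resulting pair can then be compared with the CUDA column of the table linked above.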
