I'm trying to do deep learning with CuDNNLSTM and found the installation steps in the official documentation. So I created a new PyCharm project and added only the tf-gpu package; the code ran about 5x faster.
But when I run the same code in a Jupyter Notebook, it throws an error.
The code I'm testing with is a very simple MNIST example (most steps skipped):
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
tf.config.experimental.list_physical_devices('GPU')
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train / 255.0
x_test = x_test / 255.0
model = Sequential()
model.add(LSTM(128, input_shape=x_train.shape[1:], return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128))
model.add(Dropout(0.2))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
opt = tf.keras.optimizers.Adam(learning_rate=1e-3, decay=1e-5)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3, validation_data=(x_test, y_test))
The error:
UnknownError: [_Derived_] Fail to find the dnn implementation.
[[{{node CudnnRNN}}]]
[[sequential_1/lstm_2/StatefulPartitionedCall]] [Op:__inference_distributed_function_10295]
Function call stack:
distributed_function -> distributed_function -> distributed_function
I solved this myself. I noticed that I had installed both tensorflow and tensorflow-gpu with pip from cmd. I uninstalled both of them and then installed only the tensorflow-gpu package. Now it runs fine in Jupyter Notebook too!
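For reference, the fix described above can be sketched as the following commands (package names taken from the answer; exact versions were not specified, so pin them yourself if needed):

```shell
# Remove both the CPU and GPU builds so they no longer conflict
pip uninstall -y tensorflow tensorflow-gpu

# Reinstall only the GPU build
pip install tensorflow-gpu
```

After reinstalling, restart the Jupyter kernel so the notebook picks up the new package instead of the previously imported one.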