I'm redoing my CUDA/CuDNN setup on a fresh profile on Ubuntu 22.04.3, because Keras was throwing a libcublasLt error I could not fix. For project compatibility I need TF 2.12 with CUDA 11.8 and CuDNN 8.6. I'm working on a machine that already has several CUDA versions installed.
$ cd /usr/local/
shows:
cuda cuda-11.8 cuda-12 cuda-12.0 cuda-12.3
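For context: on a default install the unversioned cuda entry is a symlink, and anything that hard-codes /usr/local/cuda will follow it regardless of what is on PATH. This is how I checked where it currently points:

$ readlink -f /usr/local/cuda
$ ls -l /usr/local | grep cuda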
I'm now at step 2.4.3 of the CuDNN installation guide, the CuDNN verification, and it returns the following output:
rm -rf *o
rm -rf mnistCUDNN
CUDA_VERSION is 12030
Linking agains cublasLt = true
CUDA VERSION: 12030
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 35 50 53 60 61 62 70 72 75 80 86 87
test.c:1:10: fatal error: FreeImage.h: No such file or directory
1 | #include "FreeImage.h"
| ^~~~~~~~~~~~~
compilation terminated.
>>> WARNING - FreeImage is not set up correctly. Please ensure FreeImage is set up correctly. <<<
[@] /usr/local/cuda/bin/nvcc -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_87,code=compute_87 -o fp16_dev.o -c fp16_dev.cu
[@] g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -o fp16_emu.o -c fp16_emu.cpp
[@] g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -o mnistCUDNN.o -c mnistCUDNN.cpp
[@] /usr/local/cuda/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_87,code=compute_87 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -lcublasLt -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm
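The FreeImage warning itself looks like a separate issue: the mnistCUDNN sample just needs the FreeImage development headers to build, and as far as I know they can be installed from the standard Ubuntu repositories before re-running make:

$ sudo apt-get install libfreeimage3 libfreeimage-dev
$ make clean && make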
Notably, it says CUDA VERSION: 12030, which I assume means 12.3. That is odd, because
$ nvcc --version
shows:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
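My guess is that the sample's Makefile calls /usr/local/cuda/bin/nvcc directly (as the [@] lines above show), while the nvcc on my PATH resolves to cuda-11.8, so the two disagree whenever the unversioned /usr/local/cuda symlink points at cuda-12.3. A quick way to compare them, and to repoint the symlink if 11.8 is the one that should win:

$ which nvcc                          # resolves via PATH, i.e. cuda-11.8
$ /usr/local/cuda/bin/nvcc --version  # what the sample's Makefile actually uses
$ sudo ln -sfn /usr/local/cuda-11.8 /usr/local/cuda   # repoint the symlink if desired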
My .bashrc contains the following lines:
export PATH=/usr/local/cuda-11.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
Now I'm wondering why I was getting
Could not load library libcublasLt.so.10. Error: libcublasLt.so.10: cannot open shared object file: No such file or directory
(this is the original problem that made me try to redo the whole setup)
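For reference, this is how I checked which libcublasLt libraries the dynamic loader can actually see; as far as I can tell, CUDA 11.8 ships libcublasLt.so.11, so a request for .so.10 hints that the TensorFlow build installed at the time targeted an older CUDA:

$ ldconfig -p | grep libcublasLt
$ ls /usr/local/cuda-11.8/lib64/libcublasLt*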
Try this, but adapt the paths to your own setup:
# Copy header files
sudo cp /home/adesoji/cudnn-linux-x86_64-8.9.6.50_cuda12-archive/include/cudnn*.h /usr/lib/cuda/include/
sudo cp /home/adesoji/cudnn-linux-x86_64-8.9.6.50_cuda12-archive/lib64/libcudnn* /usr/lib/cuda/lib64/
sudo cp /home/adesoji/cudnn-linux-x86_64-8.9.6.50_cuda12-archive/lib/libcudnn* /usr/lib/cuda/lib64/
sudo chmod a+r /usr/lib/cuda/include/cudnn.h /usr/lib/cuda/lib64/libcudnn*
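After copying, it is worth confirming that the headers now in place are the intended cuDNN release; on cuDNN 8.x the version macros live in cudnn_version.h (path assumed to match the copy destination above):

grep -A 2 CUDNN_MAJOR /usr/lib/cuda/include/cudnn_version.h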
Install CUDA 12.0 from the NVIDIA website:
https://developer.nvidia.com/cuda-12-0-0-download-archive
Then install libcublas from APT by running
sudo apt install libcublas-12-0
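To confirm the package actually installed and see where its libcublasLt landed, the usual dpkg queries work (package name as installed above):

dpkg -l | grep libcublas
dpkg -L libcublas-12-0 | grep libcublasLt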
Now go to the terminal and run
ls /usr/local | grep cuda   # to find your CUDA path
export PATH=/depot/cuda/cuda-12.0/bin:$PATH
export PATH=/depot/cuda/cuda-11.8/bin:$PATH
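These exports only last for the current shell. To persist them, and so the runtime linker can also find the libraries, I would append them to ~/.bashrc together with LD_LIBRARY_PATH (the paths below are examples, adjust them to your own install as above):

echo 'export PATH=/usr/local/cuda-11.8/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
nvcc --version   # should now report the intended release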
You should be fine with these. My Python and TensorFlow versions are shown below:
Python 3.8.10 (default, Nov 22 2023, 10:22:35)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-12-05 19:44:58.346827: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
>>> print (tf.__version__)
2.13.1
>>> import keras as ks
>>> print (ks.__version__)
2.13.1
>>>
My Keras and TensorFlow work fine with CUDA.
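As a final sanity check, this one-liner (standard TensorFlow API) should list the GPU if the CUDA and cuDNN libraries are being found at runtime:

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"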