TensorFlow GPU support not working on Ubuntu

Question

I am trying to run Keras on my GPU.

My setup:

NVIDIA GeForce RTX 3070
Ubuntu 22.04
Python: 3.10

I installed the NVIDIA driver with

sudo ubuntu-drivers install

Under "Software & Updates / Additional Drivers" it reports that nvidia-driver-535 is in use, so a driver is present.

Then I installed the CUDA toolkit with

sudo apt-get install nvidia-cuda-dev nvidia-cuda-toolkit

cuDNN with

sudo apt install nvidia-cudnn

and TensorFlow, which already includes Keras, with

pip install tensorflow

However, when I list the physical devices through the TensorFlow library itself, only the CPU is shown.

print(tf.config.list_physical_devices())
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

Importing tensorflow prints the following:

2024-06-26 23:15:15.129300: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-26 23:15:15.131933: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-26 23:15:15.170793: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-26 23:15:15.699070: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-06-26 23:15:16.077326: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-06-26 23:15:16.081814: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

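It seems that the CUDA drivers are not found and that "TensorRT" is missing.

This is a fresh Ubuntu installation, and I have not installed any other Python packages.

What do I need to do to get this working?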

Answer 1

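After a few attempts, I got one of the listed version combinations working. For anyone interested, here is the exact installation process I followed to get TensorFlow running with GPU support:

Prerequisites:

Required: Ubuntu 20.04 or Ubuntu 22.04

The "make" tool, for building code later: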

sudo apt-get install build-essential

Required package versions

https://www.tensorflow.org/install/source#gpu

In my case:

TensorFlow 2.15.0
Python 3.9-3.11
Clang 16.0.0
Bazel 6.1.0
cuDNN 8.9
CUDA 12.2
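
The version constraints above can be written down as a small check. This is only a minimal sketch, assuming the TensorFlow 2.15.0 row quoted above; the names TESTED_CONFIG and python_is_supported are made up for this illustration and are not part of any TensorFlow API.

```python
import sys

# Versions copied from the list above (the row for TensorFlow 2.15.0).
# TESTED_CONFIG is a hypothetical name used only in this sketch.
TESTED_CONFIG = {
    "tensorflow": "2.15.0",
    "python_min": (3, 9),
    "python_max": (3, 11),
    "cudnn": "8.9",
    "cuda": "12.2",
}


def python_is_supported(version=None, config=TESTED_CONFIG):
    """Return True if (major, minor) lies within the supported Python range."""
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return config["python_min"] <= major_minor <= config["python_max"]


if __name__ == "__main__":
    ok = python_is_supported()
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} supported: {ok}")
```

Running it with the interpreter you plan to use for pip tells you early whether that Python version falls inside the range for this combination.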

Step 1: NVIDIA driver

sudo ubuntu-drivers list
sudo ubuntu-drivers install

Step 2: Reboot so that the graphics driver takes effect
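
After the reboot it may be worth confirming that the driver is actually loaded before installing anything else. The following is a rough sketch that calls nvidia-smi (shipped with the driver) using its --query-gpu and --format options; the helper names parse_gpu_csv and query_driver are hypothetical.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse 'name, driver_version' CSV lines into (name, driver) tuples."""
    gpus = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The driver version is the last comma-separated field.
        name, _, driver = line.rpartition(",")
        gpus.append((name.strip(), driver.strip()))
    return gpus


def query_driver():
    """Return [(gpu_name, driver_version), ...], or None if nvidia-smi is missing or fails."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return None
    if result.returncode != 0:
        return None
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    print(query_driver())
```

A result of None at this point would suggest that the driver is not installed or that the kernel module did not load, in which case the later steps are unlikely to help.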

Step 3: CUDA 12.2

https://developer.nvidia.com/cuda-12-2-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.0-535.54.03-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.0-535.54.03-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
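
To see which toolkit release ended up installed, one option is to ask nvcc. This is a sketch under assumptions: nvcc may not be on PATH after the package install, so the code also looks in /usr/local/cuda/bin, which is assumed here to be the install location and may differ on your machine. The function names are hypothetical.

```python
import os
import re
import shutil
import subprocess

# Assumed install location of the NVIDIA packages; adjust if yours differs.
DEFAULT_NVCC = "/usr/local/cuda/bin/nvcc"


def parse_nvcc_release(text):
    """Extract (major, minor) from nvcc's 'release X.Y' line, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Return the toolkit release reported by nvcc, or None if nvcc cannot be run."""
    exe = shutil.which("nvcc")
    if exe is None and os.path.exists(DEFAULT_NVCC):
        exe = DEFAULT_NVCC
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print(installed_cuda_release())
```

For the combination used here, the expected release would be (12, 2); a different tuple means another toolkit is being picked up first.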

Step 4: cuDNN 8.9.5 for CUDA 12.x

https://docs.nvidia.com/deeplearning/cudnn/archives/cudnn-895/install-guide/index.html
https://docs.nvidia.com/deeplearning/cudnn/archives/cudnn-895/install-guide/index.html#installlinux-deb

Download:

https://developer.nvidia.com/rdp/cudnn-archive

"Download cuDNN v8.9.5 (October 27th, 2023), for CUDA 12.x"
"Local Installer for Ubuntu22.04 x86_64 (Deb)"

Install:

sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.5.30_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2204-8.9.5.30/cudnn-local-FB167084-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get install libcudnn8=8.9.5.30-1+cuda12.2
sudo apt-get install libcudnn8-dev=8.9.5.30-1+cuda12.2
sudo apt-get install libcudnn8-samples=8.9.5.30-1+cuda12.2

The "verify install" section of the documentation did not work as described for me, but GPU support works anyway, so I did not pursue it.
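
As a lighter alternative check, you can test whether the shared libraries can be loaded at all, which is what the "Cannot dlopen some GPU libraries" warning in the question is about. A minimal sketch, assuming the sonames used by CUDA 12.x and cuDNN 8.x; the list GPU_LIBRARIES and the function probe_libraries are names invented for this example, and the list is not necessarily everything TensorFlow loads.

```python
import ctypes

# Sonames assumed for CUDA 12.x and cuDNN 8.x; other releases use other suffixes.
GPU_LIBRARIES = ("libcuda.so.1", "libcudart.so.12", "libcublas.so.12", "libcudnn.so.8")


def probe_libraries(names=GPU_LIBRARIES):
    """Try to dlopen each shared library and report which ones load."""
    status = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            status[name] = True
        except OSError:
            # Raised when the dynamic loader cannot find or load the library.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, loaded in probe_libraries().items():
        print(f"{name}: {'found' if loaded else 'MISSING'}")
```

Any library reported as missing points at the step to revisit: libcuda.so.1 comes with the driver, libcudart and libcublas with the CUDA toolkit, and libcudnn with cuDNN.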

Step 5: TensorFlow

pip install tensorflow==2.15.0

pip install worked fine, so the build tools Bazel and Clang were not needed after all.

Step 6: Verify GPU support in Python

import tensorflow as tf

# List every physical device TensorFlow can see.
print(tf.config.list_physical_devices(device_type=None))
# Output: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
# Output: Num GPUs Available: 1
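
If the GPU still does not show up, it can help to compare what the installed TensorFlow wheel was built against with what is installed on the system. The sketch below uses tf.sysconfig.get_build_info() and tf.test.is_built_with_cuda(); the wrapper tensorflow_gpu_report is a hypothetical helper, and the CUDA-related keys are read with .get() because they are assumed to be present only in CUDA builds.

```python
def tensorflow_gpu_report():
    """Collect build and device information, or return None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    build = tf.sysconfig.get_build_info()
    return {
        "tf_version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Assumed to exist only in CUDA builds, hence .get() instead of indexing.
        "build_cuda_version": build.get("cuda_version"),
        "build_cudnn_version": build.get("cudnn_version"),
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    print(tensorflow_gpu_report())
```

If the reported build versions do not match the CUDA and cuDNN versions installed in steps 3 and 4, that mismatch is a likely reason for an empty GPU list.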
