TensorFlow Docker not using GPU

I'm trying to get TensorFlow running on Ubuntu 24.04.1 with a GPU.


Docker is the easiest way to run TensorFlow on a GPU since the host machine only requires the NVIDIA® driver

So I'm trying to use Docker.


To check that my GPU works with Docker, I ran:

docker run --gpus all --rm nvidia/cuda:12.6.2-cudnn-runtime-ubuntu24.04 nvidia-smi

The output is:

== CUDA ==

CUDA Version 12.6.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Sat Oct 26 01:16:50 2024
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA TITAN RTX               Off |   00000000:01:00.0 Off |                  N/A |
| 41%   40C    P8             24W /  280W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |

| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|  No running processes found                                                             |


docker run --gpus all --rm nvidia/cuda nvidia-smi


docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu \
   python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"


2024-10-26 01:20:51.021242: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1729905651.033544       1 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1729905651.037491       1 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-26 01:20:51.050486: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
W0000 00:00:1729905652.350499       1 gpu_device.cc:2344] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

This shows that TensorFlow isn't detecting the GPU.
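
The "Cannot dlopen some GPU libraries" warning in the log means TensorFlow could not load one or more CUDA shared libraries inside the container. As a rough way to see which ones the dynamic loader can find, a sketch like the following tries to open each library with ctypes. The file names in CUDA_LIBS are my assumption for a CUDA 12 / cuDNN 9 stack and are not taken from the TensorFlow docs, and check_libs is a name made up for this example.

```python
import ctypes


# Assumed sonames for a CUDA 12.x / cuDNN 9 stack; these are illustrative
# guesses and may differ for your TensorFlow build.
CUDA_LIBS = [
    "libcuda.so.1",
    "libcudart.so.12",
    "libcublas.so.12",
    "libcufft.so.11",
    "libcusparse.so.12",
    "libcudnn.so.9",
]


def check_libs(names):
    """Return {library name: True if the dynamic loader can open it}."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = True
        except OSError:
            # Raised when the library is not on the loader's search path.
            results[name] = False
    return results


if __name__ == "__main__":
    for lib, ok in check_libs(CUDA_LIBS).items():
        print(f"{lib}: {'found' if ok else 'MISSING'}")
```

Treat a MISSING result only as a hint: libraries that pip installs under site-packages may not be on the default loader path even when TensorFlow itself can locate them.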


I don't think you're doing anything wrong, but I suspect the image may be a "pip install" short of a complete image.

I'm running a different flavor of Linux, but first I had to make sure that my GPU was available to Docker (see "Add nvidia runtime to docker runtimes"), and I upgraded my CUDA version to the latest.



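Then I started a shell in the container:

docker run -it --rm --runtime=nvidia --gpus all tensorflow/tensorflow:latest-gpu /bin/bash

Inside the container, I ran:

pip install tensorflow[and-cuda]

and then checked for the GPU again:

python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"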
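
For a slightly more informative check than the one-liner, a sketch along these lines also reports whether the installed TensorFlow was built with CUDA. gpu_report is a name made up for this example; tf.sysconfig.get_build_info() and tf.test.is_built_with_cuda() are existing TensorFlow calls, but the keys in the build-info dictionary can vary by version, so they are read with .get().

```python
def gpu_report():
    """Collect basic facts about TensorFlow's view of the GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return {"tensorflow_installed": False}

    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow_installed": True,
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # These keys may be absent on CPU-only builds, hence .get().
        "cuda_version": build.get("cuda_version"),
        "cudnn_version": build.get("cudnn_version"),
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

An empty "gpus" list together with "built_with_cuda: True" would suggest the build is fine but the CUDA libraries or the device are not visible inside the container.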

You will want to use their Docker image as a base to create your own Docker image, e.g.:

# Use the official TensorFlow GPU base image
FROM tensorflow/tensorflow:latest-gpu

# Install TensorFlow with CUDA support
RUN pip install tensorflow[and-cuda]

# To make sure the CMD from the base image is preserved
CMD ["bash"]
