I'm new to Docker, so forgive me if this is a silly question. Recently I've been trying to test faster-whisper, a reimplementation of OpenAI's Whisper. To do so, I used a Docker container based on the nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04 image from Docker Hub, running on Ubuntu under WSL2. The image builds successfully, but when I run it I get this error:
```
==========
== CUDA ==
==========

CUDA Version 11.7.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Traceback (most recent call last):
  File "/home/user/Documents/experiment/./main.py", line 8, in <module>
    model = WhisperModel(model_size, device="cuda", compute_type="float16")
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 128, in __init__
    self.model = ctranslate2.models.Whisper(
RuntimeError: CUDA failed with error no CUDA-capable device is detected
```
I googled around and tried many of the suggested fixes, even completely reinstalling the NVIDIA driver, nvidia-container-toolkit, and Docker, but the error persists.

Here is the Dockerfile I used to build the image:
```dockerfile
FROM nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04

RUN apt -y update && apt -y install python3.11 python3-pip

ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES utility,compute

WORKDIR /home/user/Documents/experiment
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

CMD ["python3", "./main.py"]
```
And here is the Python script, mostly copy-pasted from the official GitHub page:
```python
from faster_whisper import WhisperModel
import os

os.environ['CUDA_VISIBLE_DEVICES'] = "0"

model_size = "large-v2"

# Run on GPU with FP16
model = WhisperModel(model_size, device="cuda", compute_type="float16")

# or run on GPU with INT8
# model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8
# model = WhisperModel(model_size, device="cpu", compute_type="int8")

segments, info = model.transcribe("./audio.wav", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
This is the command I used to run the container:

```
docker run --gpus all --runtime=nvidia -t nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04
```
And this is the nvidia-smi output from running:

```
docker run --gpus all --runtime=nvidia -t nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04 nvidia-smi
```
```
Sun Oct  8 22:08:24 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.112                Driver Version: 537.42       CUDA Version: 11.7     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3080 Ti     On  | 00000000:01:00.0  On |                  N/A |
|  0%   39C    P8              21W / 400W |    522MiB / 12288MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A        32      G   /Xwayland                                 N/A      |
+---------------------------------------------------------------------------------------+
```
Any ideas?
Maybe you need to check your application environment, in particular the CUDA build that PyTorch or whichever framework you specified expects. For example, if your environment needs cu11.8 but your installed CUDA version is 12.1, or your device does not support 11.8, a mismatch like that will raise this error.
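The version check suggested above can be sketched like this; `cuda_compatible` is a made-up helper for illustration, not part of any library, and the version numbers are taken from the `nvidia-smi` output in the question:

```python
# Hypothetical helper illustrating the suggested check: does the driver's
# supported CUDA version cover the version the framework's wheel was built for?
def cuda_compatible(required: str, available: str) -> bool:
    """Return True if the available CUDA version is >= the required one."""
    parse = lambda v: tuple(int(p) for p in v.split("."))
    return parse(available) >= parse(required)

# nvidia-smi in the question reports CUDA Version 11.7; a wheel built for
# cu118 would need 11.8, so the check fails:
print(cuda_compatible("11.8", "11.7"))  # False
print(cuda_compatible("11.7", "12.1"))  # True
```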
Start by checking both your local environment and your application environment.
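One more thing worth verifying (an assumption on my part, based on the `docker run` line in the question referencing the base image directly): make sure you are running the image you built from the Dockerfile, not the bare `nvidia/cuda` base image, which does not contain your script or its dependencies. A sketch with a made-up tag name:

```shell
# "whisper-experiment" is a hypothetical tag; use whatever tag you prefer.
docker build -t whisper-experiment .
docker run --gpus all --runtime=nvidia -t whisper-experiment
```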