I installed CUDA Toolkit 12.6 Update 1 on Ubuntu 22.04 with the following commands (the instructions can be found here):
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.6.1/local_installers/cuda-repo-ubuntu2204-12-6-local_12.6.1-560.35.03-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-6-local_12.6.1-560.35.03-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-6-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-6
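The installation succeeded. However, the nvcc version is still 11.5. These are the two packages I have installed. I assumed that installing CUDA Toolkit 12.6 would replace 11.5, but nvcc still comes from 11.5.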
$ dpkg -l nvidia-cuda-toolkit
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===================-===============-============-=================================
ii nvidia-cuda-toolkit 11.5.1-1ubuntu1 amd64 NVIDIA CUDA development toolkit
$ dpkg -l cuda-toolkit-12-6
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=================-============-============-=================================
ii cuda-toolkit-12-6 12.6.1-1 amd64 CUDA Toolkit 12.6 meta-package
How do I update the nvcc version to 12.6?
Here is the output of nvcc -V:
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
Listing the contents of cuda-toolkit-12-6 shows that it does not include nvcc:
$ dpkg -L cuda-toolkit-12-6
/.
/usr
/usr/local
/usr/local/cuda-12.6
/usr/local/cuda-12.6/version.json
/usr/share
/usr/share/doc
/usr/share/doc/cuda-toolkit-12-6
/usr/share/doc/cuda-toolkit-12-6/changelog.Debian.gz
Output of nvidia-smi:
$ nvidia-smi
Mon Sep 23 13:11:55 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4090 Off | 00000000:81:00.0 Off | Off |
| 0% 24C P8 21W / 450W | 10893MiB / 24564MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce RTX 4090 Off | 00000000:C1:00.0 Off | Off |
| 0% 24C P8 32W / 450W | 818MiB / 24564MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 1720 G /usr/lib/xorg/Xorg 4MiB |
| 0 N/A N/A 2174 C /usr/local/bin/ollama 384MiB |
| 0 N/A N/A 1808406 C /opt/tljh/user/bin/python 10486MiB |
| 1 N/A N/A 1720 G /usr/lib/xorg/Xorg 15MiB |
| 1 N/A N/A 1896 G /usr/bin/gnome-shell 10MiB |
| 1 N/A N/A 2174 C /usr/local/bin/ollama 384MiB |
| 1 N/A N/A 1808406 C /opt/tljh/user/bin/python 386MiB |
+---------------------------------------------------------------------------------------+
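You should check your environment variables. nvcc most likely still resolves to CUDA 11.5 because your PATH and LD_LIBRARY_PATH still lead to the old version: the 12.6 packages install under /usr/local/cuda-12.6, and that directory is not on PATH by default.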
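One way to confirm this before changing anything is to list every nvcc on PATH in lookup order; the first hit is the one the shell runs. The following is only a sketch: find_on_path is a hypothetical helper name (not part of CUDA or Ubuntu), and the final check assumes the default prefix /usr/local/cuda-12.6 seen in the dpkg -L output.

```shell
# Hypothetical helper: print every executable file named "$1" on PATH,
# in the order the shell searches. The first line printed is what runs.
find_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

find_on_path nvcc

# Assumed default prefix for the 12.6 packages; adjust if yours differs.
if [ -x /usr/local/cuda-12.6/bin/nvcc ]; then
    echo "12.6 nvcc is installed at /usr/local/cuda-12.6/bin/nvcc"
fi
```

If the only path printed is outside /usr/local/cuda-12.6, the new toolkit is installed but not yet on PATH.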
To fix this, update the environment variables by adding the following lines to your .bashrc (or .zshrc):
export PATH=/usr/local/cuda-12.6/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda-12.6
Then source the file:
source ~/.bashrc
This should make nvcc point to the correct CUDA version.
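To check the result, it can help to print which nvcc the shell now resolves and the release it reports. A small sketch: nvcc_release is a hypothetical helper that extracts the release number from the "Cuda compilation tools, release ..." line of nvcc -V, the same line shown in the question.

```shell
# Hypothetical helper: read `nvcc -V` output on stdin and print the release,
# e.g. "Cuda compilation tools, release 11.5, V11.5.119" yields 11.5.
nvcc_release() {
    sed -n 's/^.*release \([0-9][0-9.]*\),.*$/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc found at: $(command -v nvcc)"
    echo "reported release: $(nvcc -V | nvcc_release)"
else
    echo "nvcc not found on PATH"
fi
```

Run it in a new terminal or after sourcing the file, since already-open shells keep their old PATH.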
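Alternatively, create a symbolic link that points to the new CUDA version (for example, /usr/local/cuda pointing at /usr/local/cuda-12.6).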
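A minimal sketch of the symlink approach, assuming the versioned toolkits sit side by side as <prefix>/cuda-<version> (here /usr/local/cuda-12.6, per the dpkg -L output). relink_cuda is a hypothetical helper name; the only real command it relies on is ln -sfn.

```shell
# Hypothetical helper: make <prefix>/cuda point at <prefix>/cuda-<version>.
relink_cuda() {
    prefix=$1
    version=$2
    target="$prefix/cuda-$version"
    if [ ! -d "$target" ]; then
        echo "no toolkit directory at $target" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace an existing link,
    # -n: replace an existing link to a directory instead of descending into it
    ln -sfn "$target" "$prefix/cuda"
}

# Example (needs root for /usr/local, e.g. from a shell started with sudo -s):
#   relink_cuda /usr/local 12.6
# PATH can then use the version-independent location:
#   export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
```

It is worth running ls -l /usr/local/cuda first: on some installs that path may already be a link managed by the packaging, in which case repointing it by hand might be undone by a later package update.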