I'm running Ollama on a 4x A100 GPU server, but it looks like only 1 GPU is being used for the LLaMa3:7b model. How can I use all 4 GPUs at the same time? I'm not using Docker, just ollama serve and ollama run.
Alternatively, is there a way to run 4 server processes at once (each on a different port) for large batch processing?
Wed May 15 01:24:29 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03 Driver Version: 535.129.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100 80GB PCIe On | 00000000:17:00.0 Off | 0 |
| N/A 63C P0 293W / 300W | 39269MiB / 81920MiB | 88% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA A100 80GB PCIe On | 00000000:65:00.0 Off | 0 |
| N/A 28C P0 51W / 300W | 7MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 2 NVIDIA A100 80GB PCIe On | 00000000:CA:00.0 Off | 0 |
| N/A 28C P0 51W / 300W | 7MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 3 NVIDIA A100 80GB PCIe On | 00000000:E3:00.0 Off | 0 |
| N/A 29C P0 52W / 300W | 7MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 3420401 C ...unners/cuda_v11/ollama_llama_server 39256MiB |
+---------------------------------------------------------------------------------------+
Just found a way to run multiple instances on separate ports:
OLLAMA_HOST=135.197.255.43:11432 ./ollama serve
OLLAMA_HOST=135.197.255.43:11433 ./ollama serve
OLLAMA_HOST=135.197.255.43:11434 ./ollama serve
OLLAMA_HOST=135.197.255.43:11435 ./ollama serve
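One caveat with multiple instances: by default each server can still see all four GPUs, so they may all end up loading onto GPU 0. A rough sketch of pinning each instance to its own GPU, assuming Ollama's CUDA backend honors the standard CUDA_VISIBLE_DEVICES variable (host and ports are the same ones as above):

# Pin each server process to a dedicated GPU, one port per GPU
CUDA_VISIBLE_DEVICES=0 OLLAMA_HOST=135.197.255.43:11432 ./ollama serve &
CUDA_VISIBLE_DEVICES=1 OLLAMA_HOST=135.197.255.43:11433 ./ollama serve &
CUDA_VISIBLE_DEVICES=2 OLLAMA_HOST=135.197.255.43:11434 ./ollama serve &
CUDA_VISIBLE_DEVICES=3 OLLAMA_HOST=135.197.255.43:11435 ./ollama serve &

For the batch side, a naive round-robin dispatcher over the four ports could look like this (prompts.txt is a hypothetical file with one prompt per line, and the inline JSON escaping is deliberately simplistic):

# Spread prompts across the four servers via Ollama's /api/generate endpoint
i=0
while IFS= read -r prompt; do
  port=$((11432 + i % 4))
  curl -s "http://135.197.255.43:${port}/api/generate" \
       -d "{\"model\": \"llama3\", \"prompt\": \"${prompt}\", \"stream\": false}" &
  i=$((i + 1))
done < prompts.txt
wait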