I'm trying to fine-tune a llama3 model with Unsloth using the code provided in one of the Colab notebooks, but I'm running into problems when I run it on my own system.
Here is the error I'm getting. I'm using a conda virtual environment and following the tutorial on GitHub.

(unsloth_env) llm@llm:/mnt/ssd/unsloth$ python3 run_unsloth.py
WARNING: BNB_CUDA_VERSION=121 environment variable detected; loading libbitsandbytes_cuda121_nocublaslt121.so.
This can be used to load a bitsandbytes version that is different from the PyTorch CUDA version.
If this was unintended set the BNB_CUDA_VERSION variable to an empty string: export BNB_CUDA_VERSION=
If you use the manual override make sure the right libcudart.so is in your LD_LIBRARY_PATH
For example by adding the following to your .bashrc: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path_to_cuda_dir/lib64
Could not find the bitsandbytes CUDA binary at PosixPath('/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda121_nocublaslt121.so')
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/__init__.py:72: UserWarning: Unsloth: Running `ldconfig /usr/lib64-nvidia` to link CUDA.
warnings.warn(
/sbin/ldconfig.real: Can't create temporary cache file /etc/ld.so.cache~: Permission denied
/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/__init__.py:103: UserWarning: Unsloth: CUDA is not linked properly.
Try running `python -m bitsandbytes` then `python -m xformers.info`
We tried running `ldconfig /usr/lib64-nvidia` ourselves, but it didn't work.
You need to run in your terminal `sudo ldconfig /usr/lib64-nvidia` yourself, then import Unsloth.
Also try `sudo ldconfig /usr/local/cuda-xx.x` - find the latest cuda version.
Unsloth will still run for now, but maybe it might crash - let's hope it works!
warnings.warn(
Traceback (most recent call last):
File "/mnt/ssd/unsloth/run_unsloth.py", line 2, in <module>
import unsloth
File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/__init__.py", line 113, in <module>
from .models import *
File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/__init__.py", line 15, in <module>
from .loader import FastLanguageModel
File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/loader.py", line 15, in <module>
from .llama import FastLlamaModel, logger
File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/llama.py", line 26, in <module>
from ..kernels import *
File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/__init__.py", line 15, in <module>
from .cross_entropy_loss import fast_cross_entropy_loss
File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/cross_entropy_loss.py", line 18, in <module>
from .utils import calculate_settings, MAX_FUSED_SIZE
File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/utils.py", line 36, in <module>
cdequantize_blockwise_fp32 = bnb.functional.lib.cdequantize_blockwise_fp32
File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 73, in __getattr__
return getattr(self._lib, item)
File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/ctypes/__init__.py", line 387, in __getattr__
func = self.__getitem__(name)
File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/ctypes/__init__.py", line 392, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cdequantize_blockwise_fp32
I tried matching the CUDA and torch versions, but that didn't work.
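For what it's worth, the odd doubled "121" in the missing filename (`libbitsandbytes_cuda121_nocublaslt121.so`) appears to come from the `BNB_CUDA_VERSION=121` override being appended onto a library name that already contains the CUDA version. Here is a rough sketch of that mechanism (the function name and logic are my illustration, not the actual bitsandbytes code):

```python
import os

def bnb_library_name(base: str = "libbitsandbytes_cuda121_nocublaslt") -> str:
    # As the warning in the log suggests, a non-empty BNB_CUDA_VERSION is
    # treated as a manual override and (roughly) tacked onto the library
    # name -- producing the doubled "121" seen in the error above.
    override = os.environ.get("BNB_CUDA_VERSION", "")
    return f"{base}{override}.so"

os.environ["BNB_CUDA_VERSION"] = "121"
print(bnb_library_name())  # libbitsandbytes_cuda121_nocublaslt121.so

os.environ["BNB_CUDA_VERSION"] = ""
print(bnb_library_name())  # libbitsandbytes_cuda121_nocublaslt.so
```

So before anything else it is worth clearing the variable (`export BNB_CUDA_VERSION=`), as the warning itself recommends, so bitsandbytes falls back to its default binary selection.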
I'm sure this won't help most people, but what worked for me was:
systemctl start nvidia-fabricmanager.service
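Putting the pieces together, this is the recovery sequence implied by the log and the fix above (a sketch only; the CUDA path and whether your machine needs Fabric Manager at all depend on your system -- it is typically relevant on multi-GPU NVSwitch boxes):

```shell
# 1. Clear the bitsandbytes override so it stops looking for the
#    doubled-suffix binary:
export BNB_CUDA_VERSION=
# 2. Relink the CUDA libraries as root (the in-process ldconfig above
#    failed with "Permission denied"); adjust the path to your install:
#    sudo ldconfig /usr/local/cuda-12.1
# 3. Start Fabric Manager, which is what fixed it on this machine:
#    sudo systemctl start nvidia-fabricmanager.service
# 4. Re-run the self-checks the Unsloth warning suggests:
#    python -m bitsandbytes && python -m xformers.info
echo "BNB_CUDA_VERSION='${BNB_CUDA_VERSION}'"
```

The privileged commands are left as comments since they need root and vary by system; run them manually in order.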