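I'm trying to run the Llama-3-8B model locally in my Colab notebook using Hugging Face. When I load the model, the checkpoint shards stop loading at 25%. I can't figure out what the problem might be.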
from transformers import AutoModelForCausalLM, AutoTokenizer
# Define the model name (this is a placeholder, replace with the actual model name)
model_name = "meta-llama/Meta-Llama-3-8B"
!huggingface-cli login --token $HF_TOKEN
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# If the model is large, it might be beneficial to move it to GPU
model.to('cuda')
HF_TOKEN is defined; I've left its value out here for privacy reasons.
This is the output I get:
Your token has been saved to /root/.cache/huggingface/token
Login successful
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:89: UserWarning:
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 25%
1/4 [00:22<01:07, 22.37s/it]
Solved it: the Colab notebook had run into a storage error, which is why loading stopped.
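For anyone who hits the same stall, here is a minimal sketch of a check that could be run before calling from_pretrained. It assumes "storage" means free disk space on the Colab VM, and the 20 GiB threshold is my own rough guess (about 16 GB for an 8B-parameter checkpoint in 16-bit precision, plus headroom), not an official figure. The helper names are made up for illustration.

```python
import shutil

# Assumption: roughly 16 GB of 16-bit weights plus headroom; adjust as needed.
REQUIRED_GB = 20.0


def free_disk_gb(path="/"):
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_room_for_model(path="/", required_gb=REQUIRED_GB):
    """Return True if the filesystem at `path` has at least `required_gb` GiB free."""
    return free_disk_gb(path) >= required_gb


free = free_disk_gb("/")
print(f"Free disk space: {free:.1f} GiB")
if not has_room_for_model("/"):
    print("Probably not enough disk space to download and cache the checkpoint.")
```

If the check fails, freeing space or switching to a runtime with a larger disk is worth trying before loading again. If the limit turns out to be RAM rather than disk, this check will not catch it.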