https://huggingface.co/microsoft/phi-3-mini-128k-instruct-onnx
from transformers import AutoTokenizer, AutoModelForCausalLM
# This works fine (the full-precision model, but it is too big for my GPU)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-128k-instruct", trust_remote_code=True)

# But this throws an error (the quantized ONNX version)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct-onnx", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-128k-instruct-onnx", trust_remote_code=True)
Check this link: the tutorial downloads and runs the Phi-3 mini short-context model.
https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/phi-3-tutorial.md

The -onnx repository contains ONNX Runtime model files rather than transformers-compatible weights, which is why AutoModelForCausalLM fails on it. It is meant to be loaded with the onnxruntime-genai package instead, as the tutorial shows.
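As a rough illustration, here is a minimal sketch of what the tutorial does, assuming onnxruntime-genai is installed (pip install onnxruntime-genai) and the quantized model files have already been downloaded to a local folder. The folder path, prompt, and max_length below are placeholders, and the generation-loop API follows the tutorial as written; later onnxruntime-genai releases have changed some of these calls, so check the version you have installed.

import onnxruntime_genai as og

# Placeholder path: the local folder that holds the .onnx file and
# genai_config.json for the variant you downloaded from the -onnx repo
model = og.Model("cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4")
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()

# Phi-3 chat template from the tutorial
prompt = "<|user|>\nWhat is the capital of France? <|end|>\n<|assistant|>"
input_tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)
params.input_ids = input_tokens

generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    # Stream-decode and print each new token as it is generated
    print(tokenizer_stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
print()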