I am trying to deploy this model to AWS via SageMaker: https://huggingface.co/mosaicml/mpt-7b-chat
I generated the code from that same page, as shown below:
from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# Hub model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'mosaicml/mpt-7b-chat',
    'HF_TASK': 'text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,     # number of instances
    instance_type='ml.m5.xlarge'  # EC2 instance type
)

predictor.predict({
    'inputs': "Can you please let us know more details about your "
})
The code crashes on the predict call with the following error:

Loading /.sagemaker/mms/models/mosaicml__mpt-7b-chat requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use. Then set the option trust_remote_code=True to remove this error.
How can I deploy this model on AWS?
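One common workaround (a sketch, not a verified deployment): the SageMaker Hugging Face inference toolkit lets you override its default model loading with a custom inference.py supplied via the model's entry_point and source_dir arguments, and inside that handler you can create the pipeline with trust_remote_code=True yourself. The file layout (a code/ directory holding inference.py) and the handler names below follow the toolkit's conventions; the transformers import is deferred so the sketch stays self-contained.

```python
# code/inference.py -- sketch of a custom handler that enables
# trust_remote_code for models with remote code, like mpt-7b-chat.

def model_fn(model_dir):
    # Replace the toolkit's default loader, which raises the
    # trust_remote_code error shown above. Imported lazily so this
    # file has no top-level dependency on transformers.
    from transformers import pipeline
    return pipeline(
        "text-generation",
        model=model_dir,
        trust_remote_code=True,
    )

def predict_fn(data, pipe):
    # Run generation on the "inputs" field of the request payload.
    return pipe(data["inputs"])
```

You would then pass entry_point='inference.py' and source_dir='code' to HuggingFaceModel so the container uses this handler instead of the default one. Note also that a 7B model is unlikely to fit comfortably on ml.m5.xlarge; a larger (or GPU) instance type may be needed regardless.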
I loaded a model with HuggingFaceEmbeddings and ran into the same error, but solved it by adding trust_remote_code to model_kwargs:
from langchain.embeddings import HuggingFaceEmbeddings

# pass a real boolean, not the string "True"
model_kwargs = {"device": "cpu", "trust_remote_code": True}
encode_kwargs = {"normalize_embeddings": True}
bgeEmbeddings = HuggingFaceEmbeddings(
    model_name=model_name, model_kwargs=model_kwargs, encode_kwargs=encode_kwargs
)
Hope my experience gives you some ideas.