Importing a model with Scikit Learn on Vertex

Question

Guys, I'm trying to import a model from my local machine, but every time I get the same error in the GCP logs. The framework is scikit-learn.

AttributeError: Can't get attribute 'preprocess_text' on <module 'model_server' from '/usr/app/model_server.py'>

The code snippet that triggers the problem is:

complaints_clf_pipeline = Pipeline(
    [
        ("preprocess", text.TfidfVectorizer(preprocessor=utils.preprocess_text, ngram_range=(1, 2))),
        ("clf", naive_bayes.MultinomialNB(alpha=0.3)),
    ]
)

preprocess_text is defined in a cell above, but I keep getting this error about model_server, which does not exist anywhere in my code.

Can anyone help?

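I tried refactoring the code but hit the same error. I also tried undoing this pipeline structure, but then I got a different error when querying the model through the API.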

Answer 1

GCP is trying to load the model but cannot find the preprocess_text function, because it is not included in the serialized model.
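To see why the function goes missing, here is a minimal sketch using only the standard library. It assumes a throwaway module named demo_utils (a hypothetical stand-in for wherever preprocess_text was defined at training time) and a placeholder function body. It shows that the pickle payload holds only the module and function names, and that loading fails with an AttributeError when a module of that name exists but does not define the function, which is the same shape of error as in the question.

```python
import pickle
import sys
import types

# Build a throwaway module named "demo_utils" (hypothetical) that defines the helper.
training_module = types.ModuleType("demo_utils")
exec("def preprocess_text(text):\n    return text.lower()\n", training_module.__dict__)
sys.modules["demo_utils"] = training_module

# Pickling a function records only "demo_utils" and "preprocess_text", not the body.
payload = pickle.dumps(training_module.preprocess_text)
stores_name = b"preprocess_text" in payload  # True
stores_body = b"lower" in payload            # False

# Simulate the serving side: the module name resolves, but the function is absent.
sys.modules["demo_utils"] = types.ModuleType("demo_utils")
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc  # "Can't get attribute 'preprocess_text' on <module 'demo_utils'>"

# Once the function exists under that module name, the same payload loads fine.
sys.modules["demo_utils"].preprocess_text = training_module.preprocess_text
restored = pickle.loads(payload)
del sys.modules["demo_utils"]
```

A likely reading of the original error is the same mechanism: the function was defined in the notebook's top-level module, and on the server that name resolves to model_server.py, which has no preprocess_text.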

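When you save a scikit-learn pipeline, functions such as preprocess_text are not saved along with the model: pickle stores only a reference to the function (its module and name), not its code. To make sure GCP knows where to find this function, you can: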

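Define preprocess_text in the same script that loads the model, or package utils as part of the deployment (include it in the GCP deployment files) so that preprocess_text is available in the serving environment. For example, the function can live on a class that wraps the pipeline: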

import pickle
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

class CustomTextClassifier:
    def __init__(self):
        self.pipeline = Pipeline(
            [
                ("preprocess", TfidfVectorizer(preprocessor=self.preprocess_text, ngram_range=(1, 2))),
                ("clf", MultinomialNB(alpha=0.3)),
            ]
        )

    def preprocess_text(self, text):
        # Placeholder preprocessing; replace with the real cleaning logic.
        return text.lower()

    def train(self, X, y):
        self.pipeline.fit(X, y)

    def predict(self, X):
        return self.pipeline.predict(X)


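# Pickle stores CustomTextClassifier by reference (module and class name), so this
# class must also be importable in the environment that loads model.pkl.
model = CustomTextClassifier()
# Train the model with your data, e.g. model.train(X, y), before saving.
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)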
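The second option, shipping the helper module with the deployment, can be checked locally before uploading anything. The sketch below is only an illustration under stated assumptions: demo_text_utils is a hypothetical module name, the function body is a placeholder, and a fresh Python process stands in for the serving container. How the module actually gets into a Vertex deployment depends on your setup and is not shown here. The script pickles a function imported from the module, then tries to load it in a new interpreter first without and then with the module on the import path.

```python
import importlib
import os
import pickle
import subprocess
import sys
import tempfile

# Hypothetical helper module; the file name and function body are placeholders.
MODULE_NAME = "demo_text_utils"
MODULE_SOURCE = "def preprocess_text(text):\n    return text.lower()\n"

# Script run by the stand-in "server": unpickle the object and call it once.
LOADER = (
    "import pickle, sys\n"
    "with open(sys.argv[1], 'rb') as fh:\n"
    "    func = pickle.load(fh)\n"
    "print(func('ABC'))\n"
)


def load_in_fresh_process(pickle_path, work_dir, code_dir=None):
    """Unpickle in a new interpreter; code_dir is made importable if given."""
    env = dict(os.environ)
    if code_dir is not None:
        existing = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
        env["PYTHONPATH"] = os.pathsep.join([code_dir] + existing)
    return subprocess.run(
        [sys.executable, "-c", LOADER, pickle_path],
        env=env, cwd=work_dir, capture_output=True, text=True,
    )


with tempfile.TemporaryDirectory() as tmp:
    code_dir = os.path.join(tmp, "code")
    work_dir = os.path.join(tmp, "work")
    os.mkdir(code_dir)
    os.mkdir(work_dir)
    with open(os.path.join(code_dir, MODULE_NAME + ".py"), "w") as fh:
        fh.write(MODULE_SOURCE)

    # "Training" side: import the helper from its module and pickle it.
    sys.path.insert(0, code_dir)
    importlib.invalidate_caches()
    helper_module = importlib.import_module(MODULE_NAME)
    sys.path.remove(code_dir)
    pickle_path = os.path.join(work_dir, "model.pkl")
    with open(pickle_path, "wb") as fh:
        pickle.dump(helper_module.preprocess_text, fh)

    # "Serving" side: first without the module, then with it shipped alongside.
    without_module = load_in_fresh_process(pickle_path, work_dir)
    with_module = load_in_fresh_process(pickle_path, work_dir, code_dir)
```

In the first run the child process exits with an import error because the module is not on its path; in the second it loads the function and calls it. The same kind of dry run with the real model.pkl is a cheap way to catch a missing dependency before it shows up in the GCP logs.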
