ScikitLearn model giving 'LocalOutlierFactor' object has no attribute 'predict' error

Question

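I am new to machine learning. I have built and trained a model with the ScikitLearn library. It works perfectly well in a Jupyter notebook, but when I deploy the model to Google Cloud ML and try to serve it with a Python script, it throws an error.

Here is a snippet of my model code:

Update: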

from sklearn.metrics import classification_report, accuracy_score
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# define a random state
state = 1

classifiers = {
    "Isolation Forest": IsolationForest(max_samples=len(X),
                                       contamination=outlier_fraction,
                                       random_state=state),
    # "Local Outlier Factor": LocalOutlierFactor(
    # n_neighbors = 20,
    # contamination = outlier_fraction)
}

import pickle
# fit the model
n_outliers = len(Fraud)

for i, (clf_name, clf) in enumerate(classifiers.items()):

    # fit the data and tag outliers
    if clf_name == "Local Outlier Factor":
        y_pred = clf.fit_predict(X)
        print("LOF executed")
        scores_pred = clf.negative_outlier_factor_
        # Export the classifier to a file
        with open('model.pkl', 'wb') as model_file:
            pickle.dump(clf, model_file)
    else:
        clf.fit(X)
        scores_pred = clf.decision_function(X)
        y_pred = clf.predict(X)
        print("IF executed")
        # Export the classifier to a file
        with open('model.pkl', 'wb') as model_file:
            pickle.dump(clf, model_file)
    # Reshape the prediction values to 0 for valid and 1 for fraudulent
    y_pred[y_pred == 1] = 0
    y_pred[y_pred == -1] = 1

    n_errors = (y_pred != Y).sum()

    # run classification metrics for the classifier that was just fitted
    print('{}:{}'.format(clf_name, n_errors))
    print(accuracy_score(Y, y_pred))
    print(classification_report(Y, y_pred))

Here is the output in the Jupyter Notebook:

Isolation Forest:7

0.93

               precision    recall  f1-score   support


         0       0.97      0.96      0.96        94
         1       0.43      0.50      0.46         6

  avg / total    0.94      0.93      0.93       100

I deployed this model to Google Cloud ML-Engine and then tried to serve it with the following Python script:

import os
from googleapiclient import discovery
from oauth2client.service_account import ServiceAccountCredentials
credentials = ServiceAccountCredentials.from_json_keyfile_name('Machine Learning 001-dafe42dfb46f.json')

PROJECT_ID = "machine-learning-001-201312"
VERSION_NAME = "v1"
MODEL_NAME = "mlfd"
service = discovery.build('ml', 'v1', credentials=credentials)
name = 'projects/{}/models/{}'.format(PROJECT_ID, MODEL_NAME)
name += '/versions/{}'.format(VERSION_NAME)

data = [[265580, 7, 68728, 8.36, 4.76, 84.12, 79.36, 3346, 1, 11.99, 1.14,655012, 0.65, 258374, 0, 84.12] ]

response = service.projects().predict(
    name=name,
    body={'instances': data}
).execute()

if 'error' in response:
    print(response['error'])
else:
    online_results = response['predictions']
    print(online_results)

Here is the output of that script:

Prediction failed: Exception during sklearn prediction: 'LocalOutlierFactor' object has no attribute 'predict'

Answer 1

LocalOutlierFactor does not have a predict method, only a private _predict method. Here is the justification from the source:

def _predict(self, X=None):
    """Predict the labels (1 inlier, -1 outlier) of X according to LOF.
    If X is None, returns the same as fit_predict(X_train).
    This method allows to generalize prediction to new observations (not
    in the training set). As LOF originally does not deal with new data,
    this method is kept private.

https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/neighbors/lof.py#L200
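Since the question writes both classifiers to the same model.pkl, it may be worth unpickling the exported file and checking what it actually contains before deploying. The sketch below does this in memory; has_public_predict is a hypothetical helper name, and it assumes a LocalOutlierFactor built with default arguments, which exposes no public predict.

```python
import pickle

from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor


def has_public_predict(payload):
    """Unpickle an estimator and report its class name and whether predict() is exposed."""
    model = pickle.loads(payload)
    return type(model).__name__, hasattr(model, "predict")


# Compare the two estimators from the question (unfitted is enough for this check).
for estimator in (IsolationForest(), LocalOutlierFactor()):
    name, ok = has_public_predict(pickle.dumps(estimator))
    print(name, "exposes predict():", ok)
```

If the deployed pickle reports LocalOutlierFactor here, the serving error above is the expected result.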

Answer 2

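It looks like this may be a Python version issue (although I am not clear why scikit-learn would behave differently between Python 2 and Python 3). I was able to verify locally, on the same machine, that my Python 2 installation reproduced the error above while Python 3 succeeded (both using scikit-learn 0.19.1).

The solution is to specify the Python version when deploying the model (note the last line; if omitted, it defaults to "2.7"):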

gcloud beta ml-engine versions create $VERSION_NAME \
    --model $MODEL_NAME --origin $DEPLOYMENT_SOURCE \
    --runtime-version="1.5" --framework $FRAMEWORK \
    --python-version="3.5"

Answer 3

Surprisingly, the problem is the runtime version. It is resolved when you recreate your model version as:

gcloud beta ml-engine versions create $VERSION_NAME  --model $MODEL_NAME --origin $DEPLOYMENT_SOURCE --runtime-version="1.6" --framework $FRAMEWORK --python-version="3.5"

Using runtime version 1.6 instead of 1.5 at least gets the model running.
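Both this answer and the previous one point at a mismatch between the training environment and the serving environment. A minimal sketch for recording the versions used at training time, so they can be compared with the runtime chosen at deployment; environment_versions is a hypothetical helper name, not part of any library:

```python
import platform

import sklearn


def environment_versions():
    """Return the Python and scikit-learn versions of the current environment."""
    return {
        "python": platform.python_version(),
        "scikit-learn": sklearn.__version__,
    }


# Print next to the exported model so the values can be checked at deploy time.
print(environment_versions())
```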

Answer 4

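There are two ways to create a LocalOutlierFactor instance.

clf = LocalOutlierFactor(n_neighbors=n_neighbors, contamination=contamination, novelty=False)

In this case we can use: clf.fit_predict(X)

clf = LocalOutlierFactor(n_neighbors=n_neighbors, contamination=contamination, novelty=True)

In this case we should use: clf.predict(X)

Thanks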
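As a rough illustration of the novelty=True route, the sketch below fits on synthetic training data, round-trips the estimator through pickle as the question does, and calls predict on unseen rows. It assumes scikit-learn 0.20 or later, where the novelty parameter exists; the data and the variable names are made up for the example.

```python
import pickle

import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Synthetic training data standing in for the real feature matrix X.
rng = np.random.RandomState(1)
X_train = rng.normal(size=(200, 2))

# Two unseen rows: one near the training cloud, one far away from it.
X_new = np.array([[0.0, 0.1], [8.0, 8.0]])

# novelty=True makes predict() available for data not seen during fit().
clf = LocalOutlierFactor(n_neighbors=20, novelty=True)
clf.fit(X_train)

# Round-trip through pickle, mirroring the export step in the question.
restored = pickle.loads(pickle.dumps(clf))
labels = restored.predict(X_new)  # 1 for inliers, -1 for outliers
print(labels)
```

With novelty=True, predict is meant for new observations only, so the training rows themselves should not be passed back to it.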

Answer 5

I worked on a project that looked very similar and ran into the same error. My problem was a typo in the if statement.

Regards, Lorenz
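One way to make that kind of typo harder to hit is to branch on the estimator type instead of comparing the dictionary key to a string. This is only a sketch under that assumption: fit_and_label is a hypothetical helper, and the random data stands in for the real X.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor


def fit_and_label(clf, X):
    """Fit clf on X and return labels: 0 for inliers, 1 for outliers."""
    if isinstance(clf, LocalOutlierFactor):
        # Default LOF only labels the data it was fitted on.
        raw = clf.fit_predict(X)
    else:
        raw = clf.fit(X).predict(X)
    # scikit-learn uses 1 for inliers and -1 for outliers; remap to 0/1.
    return np.where(raw == -1, 1, 0)


rng = np.random.RandomState(1)
X_demo = rng.normal(size=(100, 3))

for clf in (IsolationForest(random_state=1), LocalOutlierFactor(n_neighbors=20)):
    labels = fit_and_label(clf, X_demo)
    print(type(clf).__name__, int(labels.sum()), "points flagged")
```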
