Problem installing Torch on Google App Engine


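I have built a machine learning API that uses Torch as the ML framework. When I upload the code to Google App Engine, the build runs out of memory. After some debugging, I found that the problem is the installation of Torch.

I am using Torch 1.5.0 and Python 3.7.4.

How can I fix this error? Maybe I can change something in app.yaml?

Error message: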

Step #1 - "builder": Traceback (most recent call last):
Step #1 - "builder":   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
Step #1 - "builder":     "__main__", fname, loader, pkg_name)
Step #1 - "builder":   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
Step #1 - "builder":     exec code in run_globals
Step #1 - "builder":   File "/usr/local/bin/ftl.par/__main__.py", line 65, in <module>
Step #1 - "builder":   File "/usr/local/bin/ftl.par/__main__.py", line 54, in main
Step #1 - "builder":   File "/usr/local/bin/ftl.par/__main__/ftl/python/builder.py", line 114, in Build
Step #1 - "builder":   File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 153, in BuildLayer
Step #1 - "builder":   File "/usr/local/bin/ftl.par/__main__/ftl/common/single_layer_image.py", line 60, in GetCacheKey
Step #1 - "builder":   File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 109, in GetCacheKeyRaw
Step #1 - "builder":   File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 332, in GetCacheKeyRaw
Step #1 - "builder":   File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 346, in _python_version
Step #1 - "builder":   File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
Step #1 - "builder":     errread, errwrite)
Step #1 - "builder":   File "/usr/lib/python2.7/subprocess.py", line 938, in _execute_child
Step #1 - "builder":     self.pid = os.fork()
Step #1 - "builder": OSError: [Errno 12] Cannot allocate memory

The same error message appears again when I do not include Torch in requirements.txt.

To reproduce:

app.yaml

runtime: python37
resources:
    memory_gb: 16
    disk_size_gb: 10

requirements.txt

gunicorn==20.0.4
aniso8601==8.0.0
beautifulsoup4==4.9.0
boto3==1.13.3
botocore==1.16.3
bs4==0.0.1
certifi==2020.4.5.1
chardet==3.0.4
click==7.1.2
colorama==0.4.3
docutils==0.15.2
filelock==3.0.12
Flask==1.1.2
Flask-RESTful==0.3.8
googletrans==2.4.0
idna==2.9
itsdangerous==1.1.0
Jinja2==2.11.2
jmespath==0.9.5
joblib==0.14.1
MarkupSafe==1.1.1
numpy==1.18.4
protobuf==3.11.3
python-dateutil==2.8.1
pytz==2020.1
regex==2020.4.4
requests==2.23.0
s3transfer==0.3.3
sacremoses==0.0.43
sentencepiece==0.1.86
six==1.14.0
soupsieve==2.0
tokenizers==0.5.2
tqdm==4.46.0
transformers==2.8.0
urllib3==1.25.9
Werkzeug==1.0.1

main.py

import flask
from flask import Flask, request
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)

production = False

import json

# Import api code

# Create main api 'view'
class main_api(Resource):

    def get(self):
        question = request.args.get('question')

        # Run the script
        # But not necessary for the minimum working test

        return {
            'question': question,
            # 'results': results_from_script,
        }

# Adds resource
api.add_resource(main_api, '/')

# Starts the api
if __name__ == '__main__':
    host = '127.0.0.1'
    port = 8080
    app.run(host=host, port=port, debug=not production)
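
To exercise the minimal endpoint above once it is running, a small client can help confirm that the `question` parameter is echoed back. The following is only a sketch using the standard library; the helper names `build_query_url` and `ask` are hypothetical and not part of the original code, and the base URL assumes the local host and port set in main.py.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, question: str) -> str:
    """Return base_url with the 'question' query parameter appended."""
    query = urllib.parse.urlencode({"question": question})
    return f"{base_url.rstrip('/')}/?{query}"


def ask(base_url: str, question: str, timeout: float = 10.0) -> dict:
    """Send a GET request to the API and decode the JSON response."""
    url = build_query_url(base_url, question)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (assumes main.py is already running on 127.0.0.1:8080):
#     ask("http://127.0.0.1:8080", "what is torch?")
```

The same call should work against a deployed instance by swapping in its URL as `base_url`.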

Tags: google-app-engine google-cloud-platform pytorch torch

1 answer

2 votes

I fixed this error by switching to the flex environment. The only thing I had to change was the contents of app.yaml:

runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app
runtime_config:
    python_version: 3

manual_scaling:
    instances: 1
resources:
    cpu: 2
    memory_gb: 5
    disk_size_gb: 10

After that, the deployment works.
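
Note that the entrypoint in this app.yaml has gunicorn bind to `$PORT`, while the `app.run(...)` block in main.py hard-codes port 8080 and only runs when the file is executed directly. If you want the local run to follow the same variable, a minimal sketch could look like the following. The helper name `resolve_port` is hypothetical, and the fallback of 8080 is simply the value already used in main.py.

```python
import os


def resolve_port(environ=None, default=8080):
    """Return the port from the PORT variable, or `default` if unset or empty."""
    env = os.environ if environ is None else environ
    raw = env.get("PORT", "")
    return int(raw) if raw else default


# Possible use in main.py when run locally:
#     app.run(host='127.0.0.1', port=resolve_port(), debug=not production)
```

Under gunicorn the `__main__` block is never reached, so this only affects local runs.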
