I am trying to write an API in Python for Excel upload and process the file in the AWS Lambda /tmp directory, but it raises an exception in Lambda


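I am writing an Excel upload API in Python, using React JS on the front end and Python Flask on the back end. I can upload the file from React JS and receive it in the Python request body. But when I process the Excel file in the AWS Lambda /tmp/ directory with the openpyxl library, I get the traceback shown below.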

Here is my Flask code:

def post(self):
    print("Entering File Upload Method #################")
    file = request.files['file']
    file.save("/tmp/" + file.filename)
    print("File Name #################",file.filename)
    recordNumber = 0
    workbook = load_workbook(filename="/tmp/" + file.filename)
    worksheet = workbook.active
    dataDict = {}
    for row in list(worksheet.rows)[1:]:
        dataDict[row[0].value] = [str(c.value) for c in row[1:]]

    print("Data Dictionary::::::", dataDict)

    for key, val in dataDict.items():
        app.db.[collection].update_many({"name": key, "number": int(val[4]), "position": val[1], "status": True}, {
                                            "$set": {"email": (val[7]).strip(), "fullName": (val[8]).strip()}})
        recordNumber = recordNumber + 1
        print("Record Updated In Database: ", recordNumber)
    
    workbook.close()

    return {"result": "File Uploaded Successfully"}, 200

Here is the exception I get in the AWS Lambda CloudWatch Logs:

Traceback (most recent call last):
  File "/var/task/flask/app.py", line 1517, in full_dispatch_request
    rv = self.dispatch_request()
  File "/var/task/flask/app.py", line 1503, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/var/task/flask_restful/__init__.py", line 467, in wrapper
    resp = resource(*args, **kwargs)
  File "/var/task/flask/views.py", line 84, in view
    return current_app.ensure_sync(self.dispatch_request)(*args, **kwargs)
  File "/var/task/flask_restful/__init__.py", line 582, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/var/task/main.py", line 1614, in post
    workbook = load_workbook(filename="/tmp/" + file.filename)
  File "/var/task/openpyxl/reader/excel.py", line 344, in load_workbook
    reader = ExcelReader(filename, read_only, keep_vba,
  File "/var/task/openpyxl/reader/excel.py", line 123, in __init__
    self.archive = _validate_archive(fn)
  File "/var/task/openpyxl/reader/excel.py", line 95, in _validate_archive
    archive = ZipFile(filename, 'r')
  File "/var/lang/lib/python3.9/zipfile.py", line 1266, in __init__
    self._RealGetContents()
  File "/var/lang/lib/python3.9/zipfile.py", line 1361, in _RealGetContents
    raise BadZipFile("Bad magic number for central directory")
zipfile.BadZipFile: Bad magic number for central directory
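
The traceback ends inside zipfile, before openpyxl parses any worksheet, so the file saved under /tmp/ may already differ from the file that was uploaded. A minimal diagnostic sketch for checking that, using only the standard library: `inspect_saved_upload` is a hypothetical helper name and is not part of my code or of openpyxl. It reports the file size and the first four bytes (a valid .xlsx normally starts with `PK\x03\x04`) and tries to open the file as a zip archive.

```python
import os
import zipfile


def inspect_saved_upload(path):
    """Collect basic facts about a saved upload to help spot corruption.

    Hypothetical helper for debugging only.
    """
    with open(path, "rb") as fh:
        magic = fh.read(4)  # a zip archive normally starts with b"PK\x03\x04"

    info = {
        "size": os.path.getsize(path),
        "magic": magic,
        "readable_zip": False,
        "error": None,
    }
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the first member with a bad CRC, or None
            bad_member = archive.testzip()
        if bad_member is None:
            info["readable_zip"] = True
        else:
            info["error"] = "corrupt member: " + bad_member
    except zipfile.BadZipFile as exc:
        info["error"] = str(exc)
    return info
```

Logging `inspect_saved_upload("/tmp/" + file.filename)` right after `file.save(...)`, and comparing the size with the size of the file on the client, should show whether the bytes changed on the way to Lambda.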

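I have configured my API Gateway to accept multipart/form-data. I have tried doing exactly the same thing locally, without the /tmp/ directory and without AWS Lambda, and it works fine. I don't understand why the same thing throws an error on Lambda.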
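
One thing that may be worth checking, assuming the API uses the Lambda proxy integration (I have not confirmed which adapter sits between API Gateway and Flask): in that event format the request body arrives in `body`, and `isBase64Encoded` says whether it is base64 text or plain text. A binary upload that arrives as plain text has possibly been altered already. The sketch below shows how the raw bytes would be recovered from such an event; `body_bytes` is a hypothetical helper, not something from my code.

```python
import base64


def body_bytes(event):
    """Return the request body of a Lambda proxy event as bytes.

    Hypothetical helper; assumes the proxy event fields
    "body" and "isBase64Encoded".
    """
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        # Binary payloads are expected to arrive base64-encoded
        return base64.b64decode(body)
    # Plain-text body: a binary file received this way may be damaged
    return body.encode("utf-8")
```

If `isBase64Encoded` is false for the upload request, the binary handling of the API (for example its binary media types setting) would be the first place I would look.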

I expect the code to read the data from the Excel file and update the matching records in the MongoDB database, using the query I have written.

python mongodb aws-lambda aws-api-gateway openpyxl