Connection timeouts when downloading from ADLS2/Blob storage


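I have implemented an Azure Function App (HTTP trigger) in Python 3 that needs to download a blob from Azure ADLS2 blob storage (the source data for the operation). The HTTP trigger receives a POST whose JSON body contains the URL of the blob in question.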
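To make the setup concrete, here is a minimal sketch of how a blob URL taken from such a JSON body could be split into storage account, container, and blob name. The names `BlobLocation` and `parse_blob_url` are hypothetical and not part of the original function; the sketch assumes URLs of the form https://<account>.blob.core.windows.net/<container>/<blob>.

```python
from typing import NamedTuple
from urllib.parse import unquote, urlsplit


class BlobLocation(NamedTuple):
    account: str
    container: str
    blob: str


def parse_blob_url(url: str) -> BlobLocation:
    """Split a blob URL into account, container and blob name."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"not an https blob URL: {url!r}")
    # The first host label is the storage account name.
    account = parts.hostname.split(".", 1)[0]
    # The first path segment is the container; the rest is the blob name.
    container, _, blob = parts.path.lstrip("/").partition("/")
    if not container or not blob:
        raise ValueError(f"URL has no container/blob path: {url!r}")
    # Percent-escapes such as %20 are decoded to plain characters.
    return BlobLocation(account, unquote(container), unquote(blob))
```

With the parts separated like this, the download can go either through a plain HTTP GET on the original URL or through a storage SDK client that takes the container and blob name separately.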

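When I run the function under the debugger, the blob download works perfectly. Once deployed to Azure, however, roughly half of the blob downloads made with requests.get fail with a ConnectionTimeout error. I tried to work around this by shortening the timeout (5 seconds) and retrying five times, but if the first attempt fails, the retries seem to fail as well. A few minutes later the same operation succeeds and takes less than a second.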

I tried googling for a solution but could not find anything relevant. Any ideas?

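Edit: I added retries so that requests.get() is attempted 100 times with a 3-second timeout. The result is that when it fails, all 100 retries fail. While the requests are failing, the exact same URL downloads the blob successfully in a browser.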
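For reference, the retry approach described here can be sketched roughly as follows. This is not the original code, which was not posted; `download_with_retries` and its `fetch` parameter are hypothetical names, and the fetch step is passed in as a callable so the loop can be exercised without network access.

```python
import logging
import time
from typing import Callable


def download_with_retries(
    fetch: Callable[[str, float], bytes],
    url: str,
    attempts: int = 5,
    timeout: float = 5.0,
    backoff: float = 0.5,
) -> bytes:
    """Call fetch(url, timeout) until it succeeds or attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url, timeout)
        except OSError as exc:  # requests' exceptions derive from OSError
            logging.warning("attempt %d/%d failed: %r", attempt, attempts, exc)
            if attempt == attempts:
                raise
            # Wait longer after each failure: backoff, 2*backoff, 4*backoff, ...
            time.sleep(backoff * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

With requests, the callable could be something like lambda url, timeout: requests.get(url, timeout=timeout).content. If every attempt fails, as reported above, the logged exception from each attempt is the first place to look.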

python-3.x azure azure-functions azure-blob-storage azure-data-lake-gen2
1 answer

Connect the Azure Function App to Application Insights and follow the logs to analyze what is behind the connection timeouts.
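As a minimal sketch of what to log, the helper below records how long a step took and writes the full traceback when it fails, so both show up in the function's logs. `log_duration` is a hypothetical helper, not part of any Azure library; the logger name in the first statement is the one the Azure SDK uses for its per-request HTTP logging.

```python
import logging
import time
from contextlib import contextmanager

# Logger used by the Azure SDK for per-request HTTP logging.
logging.getLogger(
    "azure.core.pipeline.policies.http_logging_policy"
).setLevel(logging.INFO)


@contextmanager
def log_duration(step: str):
    """Log how long a step took, and the traceback if it failed."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        # logging.exception includes the traceback of the active exception.
        logging.exception("%s failed after %.3f s", step, time.perf_counter() - start)
        raise
    else:
        logging.info("%s finished in %.3f s", step, time.perf_counter() - start)
```

Wrapping the download call in "with log_duration('blob download'):" would then give one log line per attempt, with its duration or its error.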

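Increase functionTimeout in host.json. This setting limits the total execution time of the function; the Consumption plan caps it at 10 minutes, so the value below requires a Premium or Dedicated plan: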

{
  "functionTimeout":"00:30:00"
}

Use the following code to download the blob from ADLS2/Blob storage:

import logging
import os

import azure.functions as func
from azure.storage.blob import BlobServiceClient

app = func.FunctionApp(http_auth_level=func.AuthLevel.ANONYMOUS)

@app.route(route="http_trigger")
def http_trigger(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')

    # The blob to download is named in the query string: ?blob_name=...
    blob_name = req.params.get('blob_name')
    if not blob_name:
        return func.HttpResponse(
            "Please provide a 'blob_name' parameter in the query string.",
            status_code=400
        )

    # The connection string is read from the app setting StorageAccountConnectionString.
    blob_connect_str = os.environ["StorageAccountConnectionString"]
    blob_service_client = BlobServiceClient.from_connection_string(blob_connect_str)
    blob_client = blob_service_client.get_blob_client(container="container", blob=blob_name)

    try:
        # download_blob() returns a stream; readall() reads the whole blob as bytes.
        blob_data = blob_client.download_blob()
        binary_data = blob_data.readall()

        return func.HttpResponse("Downloaded the file successfully.", status_code=200)

    except Exception as e:
        # Log the traceback so the failure is visible in the function's logs.
        logging.exception("Blob download failed.")
        return func.HttpResponse(f"An error occurred: {str(e)}", status_code=500)
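The client in the code above runs with the SDK's default timeouts and retry settings. A hedged sketch of setting them explicitly is shown below: connection_timeout, read_timeout, retry_total and retry_connect are keyword options accepted by the azure-storage-blob clients, while the values and the helper names `build_options` and `make_blob_service_client` are assumptions for illustration only.

```python
# Assumed values for illustration; tune them for your workload.
CLIENT_OPTIONS = {
    "connection_timeout": 5,  # seconds to wait while establishing the connection
    "read_timeout": 30,       # seconds to wait for response data
    "retry_total": 3,         # overall cap on retries per operation
    "retry_connect": 3,       # retries allowed for connection errors
}


def build_options(**overrides) -> dict:
    """Merge per-call overrides into the default client options."""
    return {**CLIENT_OPTIONS, **overrides}


def make_blob_service_client(connection_string: str, **overrides):
    """Build a BlobServiceClient with explicit timeouts and retry limits."""
    # Imported lazily so this module loads even where the SDK is absent.
    from azure.storage.blob import BlobServiceClient

    return BlobServiceClient.from_connection_string(
        connection_string, **build_options(**overrides)
    )
```

Creating the client once at module level and reusing it across invocations, rather than building it on every request, may also be worth trying, since it lets the underlying connections be reused.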

Deploy it to the Azure Function App.

I was able to run the deployed function successfully.

Output:

2024-10-21T13:26:45Z   [Information]   Executing 'Functions.http_trigger' (Reason='This function was programmatically called via the host APIs.', Id=84077372-207d-4248-b988-ff4542a36b4e)
2024-10-21T13:26:45Z   [Verbose]   Sending invocation id: '84077372-207d-4248-b988-ff4542a36b4e
2024-10-21T13:26:45Z   [Verbose]   Posting invocation id:84077372-207d-4248-b988-ff4542a36b4e on workerId:d24c1688-b527-4045-87d8-64d17257cc2a
2024-10-21T13:26:45Z   [Information]   Python HTTP trigger function processed a request.
2024-10-21T13:26:45Z   [Information]   Request URL: 'https://storagename.blob.core.windows.net/container/Hello%20World.txt'
Request method: 'GET'
Request headers:
    'x-ms-range': 'REDACTED'
    'x-ms-version': 'REDACTED'
    'Accept': 'application/xml'
    'User-Agent': 'azsdk-python-storage-blob/12.23.1 Python/3.11.9 (Linux-5.10.102.2-microsoft-standard-x86_64-with-glibc2.31)'
    'x-ms-date': 'REDACTED'
    'x-ms-client-request-id': '1ddf1f08-8fb0-11ef-bfb7-00155d54f124'
    'Authorization': 'REDACTED'
No body was attached to the request
2024-10-21T13:26:45Z   [Information]   Response status: 206
Response headers:
    'Content-Length': '11'
    'Content-Type': 'text/plain'
    'Content-Range': 'REDACTED'
    'Last-Modified': 'Mon, 21 Oct 2024 12:38:25 GMT'
    'Accept-Ranges': 'REDACTED'
    'ETag': '"0x8DCF1CD418FC184"'
    'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0'
    'x-ms-request-id': '31832257-401e-0019-51bc-23ce13000000'
    'x-ms-client-request-id': '1ddf1f08-8fb0-11ef-bfb7-00155d54f124'
    'x-ms-version': 'REDACTED'
    'x-ms-creation-time': 'REDACTED'
    'x-ms-blob-content-md5': 'REDACTED'
    'x-ms-lease-status': 'REDACTED'
    'x-ms-lease-state': 'REDACTED'
    'x-ms-blob-type': 'REDACTED'
    'x-ms-server-encrypted': 'REDACTED'
    'Date': 'Mon, 21 Oct 2024 13:26:44 GMT'
2024-10-21T13:26:45Z   [Information]   Executed 'Functions.http_trigger' (Succeeded, Id=84077372-207d-4248-b988-ff4542a36b4e, Duration=98ms)