The code below successfully uploads CSV and XLSX files to Google Cloud Storage - as expected, since it is taken from "Upload objects from a filesystem".
However, a CSV of roughly 2,000 KB came back with this error message:
('Connection aborted.', TimeoutError('The write operation timed out'))
Other functions have a timeout option, but that is not the case here; based on the reading list below, I have considered modifying the bucket method.
from google.cloud.storage import Client, transfer_manager


def upload_many_blobs_with_transfer_manager(input_logger,
        bucket_name, filenames, source_directory="", workers=8):
    """Upload every file in a list to a bucket, concurrently in a process pool.

    Each blob name is derived from the filename, not including the
    `source_directory` parameter. For complete control of the blob name for each
    file (and other aspects of individual blob metadata), use
    transfer_manager.upload_many() instead.
    """
    storage_client = Client()
    bucket = storage_client.bucket(bucket_name)

    results = transfer_manager.upload_many_from_filenames(
        bucket, filenames, source_directory=source_directory, max_workers=workers
    )

    for name, result in zip(filenames, results):
        # The results list is either `None` or an exception for each filename in
        # the input list, in order.
        if isinstance(result, Exception):
            input_logger.info("Failed to upload {} due to exception: {}".format(name, result))
        else:
            input_logger.info("Uploaded {} to {}.".format(name, bucket_name))
Thanks for all the help.
Nateby
I have read:
https://googleapis.dev/python/storage/latest/retry_timeout.html
Why does the upload_from_file Google Cloud Storage function throw a timeout error?
Based on these, I am considering changing

bucket = storage_client.bucket(bucket_name)

to

bucket = client.get_bucket(BUCKET_NAME, timeout=300.0)  # five minutes

but wanted to ask here first.
> I am considering changing bucket = storage_client.bucket(bucket_name) to bucket = client.get_bucket(BUCKET_NAME, timeout=300.0)  # five minutes
Yes, you are right. Based on the Stack Overflow link you shared, the eventual solution should be to define the timeout when creating the bucket client.
Also, as suggested in the documentation, you can pass a single integer or float to apply as the timeout for the entire request. For example:

bucket = client.get_bucket(BUCKET_NAME, timeout=300.0)  # five minutes
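Since the timeout that is actually being hit is on the upload requests themselves, you may also want to pass a longer timeout through to each upload. A minimal sketch, assuming the upload_kwargs parameter of transfer_manager.upload_many_from_filenames is forwarded to each blob's upload call (please verify against the google-cloud-storage version you have installed):

# Sketch: replaces the upload call inside your function, forwarding a
# five-minute per-request timeout to every upload in the batch.
# upload_kwargs is assumed to be passed through to blob.upload_from_filename.
results = transfer_manager.upload_many_from_filenames(
    bucket,
    filenames,
    source_directory=source_directory,
    max_workers=workers,
    upload_kwargs={"timeout": 300.0},  # five minutes instead of the default
)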
You can also try configuring the retry mechanism, as suggested in the documentation.
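For instance, something along these lines (a sketch based on the retry/timeout documentation page linked above; the object and file names are placeholders):

from google.cloud.storage.retry import DEFAULT_RETRY

# Allow retries of a failed request for up to 300 seconds overall, and
# tune the backoff between attempts.
modified_retry = DEFAULT_RETRY.with_deadline(300.0)
modified_retry = modified_retry.with_delay(initial=1.5, multiplier=1.2, maximum=45.0)

blob = bucket.blob("report.csv")  # hypothetical object name
blob.upload_from_filename("exports/report.csv", retry=modified_retry)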
You also need to consider your network speed relative to the file size; you can check this with an upload time calculator.
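As a rough back-of-the-envelope check (the 1 Mbps uplink here is assumed purely for illustration):

# Rough upload-time estimate: size in kilobits divided by uplink speed in kilobits/s.
file_size_kb = 2000        # the CSV that timed out, ~2,000 KB
uplink_mbps = 1.0          # assumed uplink speed; measure your own
seconds = (file_size_kb * 8) / (uplink_mbps * 1000)
print(f"Estimated upload time: {seconds:.0f} s")  # ~16 s at 1 Mbps, ~160 s at 0.1 Mbps

If the estimate comes out anywhere near the client's default timeout, a slow link rather than the code is the likely culprit.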
If your internet connection is poor, you can also try adjusting the chunk size of the upload (although this is not recommended).
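A minimal sketch of what that would look like for a single file (the bucket/file names are placeholders, and the 1 MB value is only an example):

from google.cloud.storage import Client

client = Client()
bucket = client.bucket("my-example-bucket")  # placeholder bucket name

# Smaller chunks mean each individual request transfers less data before its
# timeout applies; chunk_size must be a multiple of 256 KB (262144 bytes).
blob = bucket.blob("report.csv", chunk_size=4 * 262144)  # 1 MB chunks
blob.upload_from_filename("exports/report.csv")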
See the GitHub link for reference.