azure-sdk-for-cpp parallelism


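I have integrated the new azure-sdk-for-cpp into my project, and read/write times are very poor. I use the BlockBlobClient methods StageBlock and CommitBlockList, and my implementation is almost one-to-one with the samples from the SDK.

Is there a flag that enables parallelism or multithreading in the SDK to improve upload speed?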

int offset = 0;
while(buffer_size - buffer.size()  < length) {
    int remaining= buffer_size - buffer.size();

    buffer.insert(buffer.end(), data + offset, data + offset + remaining);

    auto block_id= GetBlockId(std::to_string(m_block_count++));

    auto block_content= ::Azure::Core::IO::MemoryBodyStream(buffer.data(), buffer.size());

    uncomitted_blocks_ids.push_back(block_id);

    blob_client.AsBlockBlobClient().StageBlock(block_id, block_content); 

    buffer.clear();

    offset += remaining;
    length -= remaining;
}

blob_client.AsBlockBlobClient().CommitBlockList(uncomitted_blocks_ids);

return 0;

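For downloads, I see a concurrency option in the blob options, but my tests show a speedup of only about 10-20%.

Thanks in advance!

Update:

I am looking for a way to do what upload_block_async did in the old SDK.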

azure azure-sdk
1 Answer

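Is there a flag that enables parallelism or multithreading in the SDK to improve upload speed?

The Azure C++ SDK provides a basic single-threaded call for uploading a block, but you can explicitly use multiple threads to upload blocks in parallel. The code below does this to improve upload speed.

To do this, split the data into blocks and use asynchronous tasks to run several uploads at the same time.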
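The splitting step can be sketched on its own, without the SDK. This is only an illustration: the BlockRange struct and the PlanBlocks function are hypothetical names, not part of azure-sdk-for-cpp. It computes, for a given total size and block size, the offset, length, and fixed-width ID of each block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper type: where one block sits in the source data.
struct BlockRange {
    std::string id;       // zero-padded, so every ID has the same length
    std::int64_t offset;  // byte offset of the block in the source
    std::int64_t size;    // block length in bytes (the last one may be shorter)
};

// Split total_size bytes into consecutive blocks of at most block_size bytes.
// Returns an empty plan when there is nothing to upload.
std::vector<BlockRange> PlanBlocks(std::int64_t total_size, std::int64_t block_size) {
    assert(block_size > 0);
    std::vector<BlockRange> blocks;
    for (std::int64_t offset = 0; offset < total_size; offset += block_size) {
        char id[24];
        std::snprintf(id, sizeof(id), "%08zu", blocks.size());
        blocks.push_back({id, offset, std::min(block_size, total_size - offset)});
    }
    return blocks;
}
```

For example, a 10 MiB input with 4 MiB blocks yields three entries: two full blocks and a final 2 MiB block. Using 64-bit offsets keeps the plan valid for files larger than 2 GiB.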

Code:

#include <azure/storage/blobs.hpp>
#include <algorithm>  // std::min
#include <chrono>
#include <cstdint>    // uint8_t
#include <fstream>
#include <future>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using namespace Azure::Storage::Blobs;

const int BlockSize = 4 * 1024 * 1024;  // 4 MiB per block
const int MaxConcurrency = 8;           // uploads in flight per batch

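// Zero-pads the block number to 8 characters so that every block ID has the
// same length, which Blob Storage requires for all blocks of one blob.
std::string GetBlockId(const std::string& blockNumber) {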
    std::stringstream ss;
    ss << std::setw(8) << std::setfill('0') << blockNumber;
    return ss.str();
}

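// Despite the name, StageBlock blocks until the block is uploaded;
// the concurrency comes from launching this function through std::async.
void UploadBlockAsync(const std::string& block_id, const std::vector<uint8_t>& data,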
    BlockBlobClient& blockBlobClient) {
    auto block_content = Azure::Core::IO::MemoryBodyStream(data.data(), data.size());
    blockBlobClient.StageBlock(block_id, block_content);
}

int UploadFileWithParallelism(const std::string& file_path, BlockBlobClient& blockBlobClient) {
    std::ifstream file(file_path, std::ios::binary | std::ios::ate);
    if (!file) {
        std::cerr << "Failed to open file: " << file_path << std::endl;
        return 1;
    }

    std::streamsize file_size = file.tellg();
    file.seekg(0, std::ios::beg);

    std::vector<std::string> uncommitted_block_ids;
    std::vector<std::future<void>> futures;
    int block_count = 0;
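    std::streamsize offset = 0;  // 64-bit, so files larger than 2 GiB do not overflow

    auto start = std::chrono::high_resolution_clock::now();

    while (offset < file_size) {
        // The last block may be shorter than BlockSize.
        const std::streamsize current_block_size =
            std::min<std::streamsize>(BlockSize, file_size - offset);
        std::vector<uint8_t> block_data(static_cast<size_t>(current_block_size));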
        if (!file.read(reinterpret_cast<char*>(block_data.data()), current_block_size)) {
            std::cerr << "Failed to read block from file." << std::endl;
            return 1;
        }
        auto block_id = GetBlockId(std::to_string(block_count++));
        uncommitted_block_ids.push_back(block_id);
        futures.push_back(std::async(std::launch::async, UploadBlockAsync, block_id, block_data, std::ref(blockBlobClient)));

        offset += current_block_size;
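        // Once a full batch is in flight, wait for all of it before reading more blocks.
        if (futures.size() >= static_cast<size_t>(MaxConcurrency)) {
            for (auto& future : futures) future.get();
            futures.clear();
        }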
    }

    for (auto& future : futures) future.get();
    blockBlobClient.CommitBlockList(uncommitted_block_ids);
    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> elapsed = end - start;
    double time_taken = elapsed.count(); // Time in seconds

    // Calculate upload speed in MB/s
    double upload_speed = (file_size / (1024.0 * 1024.0)) / time_taken;

    std::cout << "Upload completed successfully." << std::endl;
    std::cout << "Time taken: " << time_taken << " seconds" << std::endl;
    std::cout << "Average upload speed: " << upload_speed << " MB/s" << std::endl;

    return 0;
}

int main() {
    // Define your connection string, container/blob name, and file path.
    std::string connection_string = "xxx";
    std::string container_name = "result";
    std::string blob_name = "test.mp4";
    std::string file_path = "xxx";

    // Create a BlockBlobClient using the connection string.
    BlobContainerClient container_client = BlobContainerClient::CreateFromConnectionString(connection_string, container_name);
    BlockBlobClient block_blob_client = container_client.GetBlockBlobClient(blob_name);

    // Perform the upload with parallelism.
    int result = UploadFileWithParallelism(file_path, block_blob_client);

    return result;
}

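The code above splits the file into blocks, uploads them concurrently with asynchronous tasks, and finally commits the block list to complete the upload. When the process finishes, it measures and prints the upload time and speed.

Output: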

Upload completed successfully.
Time taken: 15.3139 seconds
Average upload speed: 4.3364 MB/s

The code above uploaded a 66 MB file to Azure Blob Storage in about 15 seconds.
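One possible refinement, offered as a sketch rather than a measured improvement: the loop above waits for a whole batch of 8 uploads before starting the next batch, so one slow block stalls the other seven slots. The variant below waits only for the oldest task and then reuses its slot. RunBounded and its task parameter are hypothetical names for this illustration; task stands in for any blocking call, such as staging one block.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <future>

// Runs task(0) ... task(count - 1) on background threads while keeping at
// most max_in_flight of them running. `task` is an assumption of this sketch
// (a stand-in for a blocking upload call), not an SDK API.
void RunBounded(std::size_t count, std::size_t max_in_flight,
                const std::function<void(std::size_t)>& task) {
    assert(max_in_flight > 0);
    std::deque<std::future<void>> in_flight;
    for (std::size_t i = 0; i < count; ++i) {
        if (in_flight.size() == max_in_flight) {
            // Wait only for the oldest task, then reuse its slot.
            in_flight.front().get();  // rethrows any exception from that task
            in_flight.pop_front();
        }
        in_flight.push_back(std::async(std::launch::async, task, i));
    }
    // Drain the tasks that are still running.
    for (auto& f : in_flight) f.get();
}
```

In the upload function, task could be a lambda that takes block i and passes it to UploadBlockAsync, with CommitBlockList called after RunBounded returns. Waiting on the oldest task is simpler than waiting on whichever finishes first, at the cost of occasionally leaving a slot idle.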
