How to install and configure Snakemake for an Azure environment


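I am new to both Snakemake and Azure cloud computing. My organization has set up an Azure Batch account, a resource group, and a storage account for me, and I have successfully moved my data into Azure Blob storage. I would like to use Snakemake to run my analysis, and I have some very basic questions: should Snakemake and its dependencies be installed locally or on an Azure VM, and how do I set up the configuration file correctly?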

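I have searched the Snakemake documentation but found nothing that answers these questions. I would really appreciate advice and suggestions from anyone who has successfully used Snakemake with Azure Batch, Azure Blob storage, and so on. Thanks!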

Update:

I installed mambaforge and Snakemake locally on WSL and uploaded the tutorial data to a container named "snakemake-tutorial". This is the code I am using (with real values wherever <placeholders> appear):

# Log in to Azure CLI
az login --tenant <tenant-ID> --use-device-code

# Set default resource group for Azure CLI commands
az configure --defaults group=<resource-group>


# Define variables for storage account and container
export account_name="<storage-account>"
export account_key=$(az storage account keys list -n $account_name -o tsv | head -n1 | cut -f 4)
export container_name="snakemake-tutorial" # The blob container to use for this analysis
start_date=$(date -u -d "1 minute ago" '+%Y-%m-%dT%H:%MZ')
expiry_date=$(date -u -d "7 days" '+%Y-%m-%dT%H:%MZ')

# Generate SAS token for the *storage account* with read, write, delete, and list permissions
export sas_token=$(az storage account generate-sas \
    --account-name "$account_name" \
    --account-key "$account_key" \
    --resource-types sco \
    --permissions "acdlrw" \
    --services bf \
    --start "$start_date" \
    --expiry "$expiry_date" \
    --https-only \
    --output tsv)

# Set environment variables required for Snakemake and Azure Batch
export AZ_BLOB_ACCOUNT_URL="https://<storage-account>.blob.core.windows.net/?$sas_token"
export AZ_BLOB_PREFIX=$container_name
export AZ_BATCH_ACCOUNT_URL="https://<batch-account>.westus2.batch.azure.com"
export AZ_BATCH_ACCOUNT_KEY="$(az batch account keys list --name "<batch-account>" -o tsv | head -n1 | cut -f2)"
export BATCH_MANAGED_IDENTITY_CLIENT_ID="<client-ID>"
export BATCH_MANAGED_IDENTITY_RESOURCE_ID="<resource-ID>"

# Run the Snakemake workflow with Azure Batch and Azure Blob Storage
snakemake --jobs 3 --verbose \
    --use-conda \
    --container-image snakemake/snakemake \
    --default-remote-provider AzBlob \
    --default-remote-prefix $AZ_BLOB_PREFIX \
    --envvars AZ_BLOB_ACCOUNT_URL \
    --az-batch \
    --az-batch-account-url $AZ_BATCH_ACCOUNT_URL
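Several of the variables above are filled in by `head | cut` pipelines, which produce an empty string without any error if the column layout of the `az` output differs from what the script expects. A minimal sketch of a pre-flight check follows; `check_required_vars` is a hypothetical helper name, not something provided by Snakemake or the Azure CLI.

```shell
# Hypothetical helper (not part of Snakemake or the Azure CLI):
# succeed only if every named variable is exported and non-empty.
check_required_vars() {
    local name missing=0
    for name in "$@"; do
        # printenv only sees exported variables, which is what a child
        # process such as snakemake would actually inherit.
        if [ -z "$(printenv "$name")" ]; then
            echo "missing or empty: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It could be called just before the snakemake command, for example `check_required_vars AZ_BLOB_ACCOUNT_URL AZ_BLOB_PREFIX AZ_BATCH_ACCOUNT_URL AZ_BATCH_ACCOUNT_KEY || exit 1`, so that an empty key is caught locally rather than surfacing as a remote authorization error.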

I get the following error:

Traceback (most recent call last):
  File "/home/molecularlab/mambaforge/envs/snakemake-tutorial/lib/python3.12/site-packages/snakemake/executors/azure_batch.py", line 781, in create_batch_pool
    self.batch_client.pool.add(new_pool)
  File "/home/molecularlab/mambaforge/envs/snakemake-tutorial/lib/python3.12/site-packages/azure/batch/operations/_pool_operations.py", line 231, in add
    raise models.BatchErrorException(self._deserialize, response)
azure.batch.models._models_py3.BatchErrorException: Request encountered an exception.
Code: AuthorizationFailure
Message: {'additional_properties': {}, 'lang': 'en-US', 'value': 'This request is not authorized to perform this operation.\nRequestId:668a8218-90ea-476b-9f0b-52456588ce87\nTime:2024-05-22T04:05:06.8248481Z'}

Is this some kind of backend setup issue? Do I need to "register" Snakemake with Azure, as discussed here: https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-register-app
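One thing that can be inspected locally is what the generated SAS token actually contains. This does not by itself explain the AuthorizationFailure, which the traceback shows is raised by the Batch pool creation call rather than by a storage request, but it helps rule out a malformed or expired token. The sketch below assumes the usual query-string form of a SAS token (fields such as `ss`, `srt`, `sp`, `st`, `se`, and `sig`); `describe_sas_token` is a hypothetical helper name.

```shell
# Hypothetical helper: print the fields of a SAS token one per line,
# masking the signature so the output can be shared safely.
# Values are shown as they appear in the token (still URL-encoded).
describe_sas_token() {
    # Drop a leading "?" if present, then split the query string on "&".
    printf '%s\n' "${1#\?}" | tr '&' '\n' | while IFS='=' read -r key value; do
        case "$key" in
            sig) echo "sig=<hidden>" ;;
            *)   echo "${key}=${value}" ;;
        esac
    done
}
```

With the variables from the script above, `describe_sas_token "$sas_token"` would list the services (`ss`), resource types (`srt`), permissions (`sp`), and start and expiry times (`st`, `se`) encoded in the token.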

Thank you very much for your help!

azure snakemake
1 Answer

To install and configure Snakemake for an Azure environment, you can follow these steps:

Connect to Azure through the az CLI.

Install the required dependencies:

conda create -c bioconda -c conda-forge -n snakemake snakemake msrest azure-batch azure-storage-blob azure-mgmt-batch azure-identity

conda activate snakemake


After installation, you can verify it with the following command:

snakemake --version
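`snakemake --version` only confirms that Snakemake itself is on the path. A minimal sketch for also checking that the Azure packages are importable in the active environment follows; `check_python_modules` is a hypothetical helper name, and it assumes `python3` resolves to the interpreter of the activated conda environment.

```shell
# Hypothetical helper: try to import each named Python module with the
# python3 on PATH and report which ones are missing.
check_python_modules() {
    local module status=0
    for module in "$@"; do
        if python3 -c "import ${module}" 2>/dev/null; then
            echo "ok: ${module}"
        else
            echo "MISSING: ${module}"
            status=1
        fi
    done
    return "$status"
}
```

For the packages installed above, the call would be `check_python_modules snakemake msrest azure.batch azure.storage.blob azure.mgmt.batch azure.identity`; note that the import names use dots where the conda package names use hyphens.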

Reference: Setting up Snakemake
