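We're using Terraform to deploy our infrastructure, and we use an Azure storage account to host the state files. Naturally, we'd rather not open the storage account to the public network, so we temporarily add the agent's IP address to the firewall rules instead.
We have the following script, which updates the storage account's firewall rules and then tries to fetch a blob. Whenever that fails, it sleeps and tries again, up to a maximum of roughly 10 minutes.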
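For illustration only, the "temporarily add the agent's IP" step might look like the sketch below. It is not the original script: the resource group name and the use of api.ipify.org to discover the agent's public IP are assumptions, and only the `az storage account network-rule add/remove` commands themselves are standard Azure CLI.

```powershell
# Hypothetical values; replace with your own.
$resourceGroup      = "rg-terraform-state"   # assumed resource group name
$storageAccountName = "mystorageaccount"     # assumed storage account name

# Assumption: the agent's public IP is discovered through an external service.
$agentIp = (Invoke-RestMethod -Uri "https://api.ipify.org").Trim()

# Add the agent's IP to the storage account firewall.
az storage account network-rule add `
    --resource-group $resourceGroup `
    --account-name $storageAccountName `
    --ip-address $agentIp

# ... run the connectivity check and Terraform here ...

# Remove the rule again once the pipeline is done.
az storage account network-rule remove `
    --resource-group $resourceGroup `
    --account-name $storageAccountName `
    --ip-address $agentIp
```

Note that a newly added rule is not necessarily effective immediately, which is why a retry loop follows the add step.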
$attemptCount = 1
$attemptInterval = 5
$maxAttempts = ((60 * 10) + $attemptInterval) / $attemptInterval # Try for 10 mins
# Traditional try/catch does not work in ADO pipelines for whatever reason
# Retry logic below was created using this as ref: https://devopsjournal.io/blog/2019/07/12/Azure-CLI-PowerShell
while ($true)
{
    if ($attemptCount -gt $maxAttempts)
    {
        Write-Host "Maximum attempts made on firewall rule / access. Aborting script."
        return 1
    }
    Write-Host "Trying storage account firewall (attempt $attemptCount of $maxAttempts)"
    $output = az storage blob exists --account-key $key --account-name $storageAccountName --container-name "tfstate" --name someBlob | ConvertFrom-Json
    Write-Debug "Last exit code: $LASTEXITCODE"
    if ($LASTEXITCODE -gt 0)
    {
        Write-Host "Storage account access failed; sleeping"
        $attemptCount = $attemptCount + 1
        Start-Sleep -Seconds $attemptInterval
    }
    else
    {
        Write-Host "Successfully connected to service account"
        break
    }
}
This sometimes appears to work:
<Storage account logs>
Trying storage account firewall (attempt 1 of 121)
ERROR:
The request may be blocked by network rules of storage account. Please check network rule set using 'az storage account show -n accountname --query networkRuleSet'.
If you want to change the default action to apply when no rule matches, please use 'az storage account update'.
Storage account access failed; sleeping
Trying storage account firewall (attempt 2 of 121)
Successfully connected to service account
<Terraform init logs>
Initializing the backend...
Successfully configured the backend "azurerm"! Terraform will automatically
use this backend unless the backend configuration changes.
Other times, it does not:
<Storage account logs>
Trying storage account firewall (attempt 1 of 121)
ERROR:
The request may be blocked by network rules of storage account. Please check network rule set using 'az storage account show -n accountname --query networkRuleSet'.
If you want to change the default action to apply when no rule matches, please use 'az storage account update'.
Storage account access failed; sleeping
Trying storage account firewall (attempt 2 of 121)
Successfully connected to service account
<Terraform init logs>
Initializing the backend...
Error: Failed to get existing workspaces: containers.Client#ListBlobs: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailure" Message="This request is not authorized to perform this operation.\nRequestId:XXXXXXXXXXXXXXXXX\nTime:XXXXXXXXXXXXXX"
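Note that this all runs on the same agent, in the same stage and the same job, so my question is: why does the agent seem to "lose" access a few steps later, when Terraform runs?
I updated the PowerShell to create a file rather than just check for existence, wondering whether there was some kind of tiering effect where I could check for the file but not actually read or write it.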
$output = az storage blob upload -f $testFilePath --account-name $storageAccountName --account-key $key --container-name "tfstate" -n connTest --overwrite | ConvertFrom-Json
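This shows the same behavior: the file is pushed up successfully, but Terraform's init still errors.
I've also found that my cleanup of the test file sometimes fails as well, which means something inconsistent is happening between pushing the file and then trying to delete it in the same script.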
Trying storage account firewall (attempt 2 of 121)
<File is uploaded here>
Alive[################################################################] 100.0000%
Finished[#############################################################] 100.0000%
Successfully connected to service account
<File fails to be deleted here>
ERROR:
The request may be blocked by network rules of storage account. Please check network rule set using 'az storage account show -n accountname --query networkRuleSet'.
If you want to change the default action to apply when no rule matches, please use 'az storage account update'.
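Assuming the agent you mention ("this all runs on the same agent, in the same stage and the same job") is an Azure VM created in the same region as the Azure storage account and configured as a self-hosted pipeline agent, the intermittent failures when accessing the storage account from the VM's IP may be related to the following limitation described in this document.
Restricting access to Azure services deployed in the **same** region as the storage account. Services deployed in the same region as the storage account use private Azure IP addresses for communication. Thus, you cannot restrict access to specific Azure services based on their public outbound IP address range.
We can use virtual network rules to allow same-region requests instead of adding IP ranges to the storage account's allow list. A virtual network rule only checks that a request was sent from the expected VNet, regardless of whether it went out over a public or a private IP.
If your pipeline happens to run on Microsoft-hosted agents, consider switching to self-hosted agents whose subnet is in the VNet rule allow list.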
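As a rough sketch of the virtual network rule approach, assuming the self-hosted agent VM sits in a subnet you control, the Azure CLI steps could look like the following. All resource names here are hypothetical placeholders, not values from the question.

```powershell
# Hypothetical names; replace with your own resources.
$agentResourceGroup   = "rg-agents"            # resource group of the agent VNet (assumed)
$vnetName             = "vnet-agents"          # VNet hosting the agent VM (assumed)
$subnetName           = "snet-agents"          # subnet hosting the agent VM (assumed)
$storageResourceGroup = "rg-terraform-state"   # resource group of the storage account (assumed)
$storageAccountName   = "mystorageaccount"     # storage account holding the tfstate container (assumed)

# 1. Enable the Microsoft.Storage service endpoint on the agent's subnet.
az network vnet subnet update `
    --resource-group $agentResourceGroup `
    --vnet-name $vnetName `
    --name $subnetName `
    --service-endpoints Microsoft.Storage

# 2. Look up the subnet's resource ID.
$subnetId = az network vnet subnet show `
    --resource-group $agentResourceGroup `
    --vnet-name $vnetName `
    --name $subnetName `
    --query id --output tsv

# 3. Allow that subnet through the storage account firewall.
az storage account network-rule add `
    --resource-group $storageResourceGroup `
    --account-name $storageAccountName `
    --subnet $subnetId

# 4. Inspect the resulting rule set (the same query the CLI error message suggests).
az storage account show --name $storageAccountName --query networkRuleSet
```

Because the rule is tied to the subnet rather than to a public IP, it can stay in place permanently, so the pipeline would no longer need to add and remove a firewall entry on each run.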