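I am using ECS Fargate with an ALB and a target group that uses IP targets. When my service creates tasks, one of them fails with the error below. Some time after the failure, a new task is created, runs, and reaches steady state. Please see the attached image and the task definition.
The error is: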
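Task stopped at: 2023-12-15T06:06:25.165Z ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve ecr registry auth: service call has been retried 3 time(s): RequestError: send request failed caused by: Post "https://api.ecr.ap-south-1.amazonaws.com/": dial tcp 13.234.9.92:443: i/o timeout. Please check your task network configuration.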
Similarly, when I create a service with 2 desired tasks, one runs fine and the second hits the same problem. When I specify 3 desired tasks, two run fine and the third fails with the same error.
My task definition is as follows:
{
"taskDefinitionArn": "arn:aws:ecs:ap-south-1:6xxxx5:task-definition/qrc-bg-test-v1-TaskDefinition:2",
"containerDefinitions": [
{
"name": "qrc-bg-test-v1-Container",
"image": "6xxx5.dkr.ecr.ap-south-1.amazonaws.com/qrc:latest",
"cpu": 0,
"links": [],
"portMappings": [
{
"containerPort": 5000,
"hostPort": 5000,
"protocol": "tcp"
}
],
"essential": true,
"entryPoint": [],
"command": [],
"environment": [],
"environmentFiles": [],
"mountPoints": [],
"volumesFrom": [],
"secrets": [],
"dnsServers": [],
"dnsSearchDomains": [],
"extraHosts": [],
"dockerSecurityOptions": [],
"dockerLabels": {},
"ulimits": [],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-create-group": "true",
"awslogs-group": "/ecs/qrc-bg-test-v1TaskDefinition",
"awslogs-region": "ap-south-1",
"awslogs-stream-prefix": "ecs"
},
"secretOptions": []
},
"systemControls": []
}
],
"family": "qrc-bg-test-v1-TaskDefinition",
"taskRoleArn": "arn:aws:iam::6xxxx5:role/ecsTaskExecutionRole",
"executionRoleArn": "arn:aws:iam::6xxx5:role/ecsTaskExecutionRole",
"networkMode": "awsvpc",
"revision": 2,
"volumes": [],
"status": "ACTIVE",
"requiresAttributes": [
{
"name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
},
{
"name": "ecs.capability.execution-role-awslogs"
},
{
"name": "com.amazonaws.ecs.capability.ecr-auth"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
},
{
"name": "com.amazonaws.ecs.capability.task-iam-role"
},
{
"name": "ecs.capability.execution-role-ecr-pull"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
},
{
"name": "ecs.capability.task-eni"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.29"
}
],
"placementConstraints": [],
"compatibilities": [
"EC2",
"FARGATE"
],
"requiresCompatibilities": [
"FARGATE"
],
"cpu": "1024",
"memory": "3072",
"registeredAt": "2023-12-15T06:00:50.419Z",
"registeredBy": "arn:aws:sts::6xxxxx5:assumed-role/B_Role",
"tags": []
}
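Since only some tasks fail while others with the same configuration work, check whether you are launching tasks in multiple subnets. Some tasks may be starting in a subnet that cannot reach the ECR API.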
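For example, a task may start in a private subnet with no internet access (judging by the IP in the error message, the task is currently trying to reach the API over the internet), or in a subnet with no VPC endpoint for that API.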
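To make the point about the IP concrete, here is a minimal Python sketch that pulls the address out of the `dial tcp` part of the error and checks whether it is public or private. The helper names `endpoint_ip_from_error` and `is_private_ip` are made up for illustration. The assumption is that an interface VPC endpoint with private DNS would resolve to a private address inside the VPC, so a public address suggests the task is going to the public ECR endpoint.

```python
import ipaddress
import re
from typing import Optional


def endpoint_ip_from_error(message: str) -> Optional[str]:
    """Extract the IPv4 address from a 'dial tcp <ip>:<port>' fragment, if any."""
    match = re.search(r"dial tcp (\d{1,3}(?:\.\d{1,3}){3}):\d+", message)
    return match.group(1) if match else None


def is_private_ip(ip: str) -> bool:
    """True for RFC 1918 and other non-public ranges."""
    return ipaddress.ip_address(ip).is_private


# Fragment of the error message from the question.
error = (
    'Post "https://api.ecr.ap-south-1.amazonaws.com/": '
    "dial tcp 13.234.9.92:443: i/o timeout"
)

ip = endpoint_ip_from_error(error)
if ip is not None:
    kind = "private" if is_private_ip(ip) else "public"
    print(ip, kind)  # prints: 13.234.9.92 public
```

A public address here is consistent with the task needing a route to the internet (or a VPC endpoint that is not set up) in the subnet where it landed.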
Occasionally you get lucky and the task launches in a subnet that has the necessary access.
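One way to find the odd subnet out is to compare the route tables of the subnets the service uses. The sketch below is only an illustration: it assumes you have already collected each subnet's routes in the shape returned by the EC2 `DescribeRouteTables` API (`DestinationCidrBlock`, `GatewayId`, `NatGatewayId`, `State`). The subnet IDs, gateway IDs, and function names are hypothetical.

```python
from typing import Dict, List

Route = Dict[str, str]


def has_default_route(routes: List[Route]) -> bool:
    """True if an active 0.0.0.0/0 route points at an internet or NAT gateway."""
    for route in routes:
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("State", "active") != "active":
            continue  # e.g. a blackhole route to a deleted NAT gateway
        via_igw = route.get("GatewayId", "").startswith("igw-")
        via_nat = route.get("NatGatewayId", "").startswith("nat-")
        if via_igw or via_nat:
            return True
    return False


def subnets_without_egress(routes_by_subnet: Dict[str, List[Route]]) -> List[str]:
    """Return the subnet IDs whose route table has no usable default route."""
    return sorted(
        subnet_id
        for subnet_id, routes in routes_by_subnet.items()
        if not has_default_route(routes)
    )


# Hypothetical data: one subnet routes to an internet gateway, one is local-only.
sample = {
    "subnet-aaa": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"},
        {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-123", "State": "active"},
    ],
    "subnet-bbb": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"},
    ],
}

print(subnets_without_egress(sample))  # prints: ['subnet-bbb']
```

This only looks at routes. A Fargate task in a subnet that routes through an internet gateway also needs a public IP assigned, and a subnet with no explicit route table association uses the VPC's main route table, so treat the result as a starting point rather than a verdict.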