ECS Fargate container not pulling from ECR through VPC endpoints


I have been stuck on this issue for a week.

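I have a Fargate container running as a service in a private subnet, and I want to restrict it to private-network access only. However, I can't pull the image from my private ECR repository over the private network. Here is what I have done so far: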

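  • Created VPC endpoints for ecr-api, ecr-dkr, s3 (gateway type, also added to the private subnets' route table), and logs. Enabled private DNS on them where possible, and opened their SGs to 0.0.0.0/0 (for testing purposes).
  • For the Fargate SG, opened ingress/egress to the entire VPC CIDR as well as to the security groups attached to the VPC endpoints.
  • Verified IAM permissions on both sides: the ECR repository has a policy that allows all users to perform actions on the repository, and the Fargate task role also contains all the relevant IAM permissions.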
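
For reference, the endpoint setup described in the list above might look roughly like the following Terraform. This is only a sketch, not the actual configuration from the question: the resource names, `aws_security_group.vpc_endpoints`, and `data.aws_route_table.private` are hypothetical, while `var.region`, `data.aws_vpc.vpc`, and the subnet data sources are taken from the Terraform shown later in the question.

```hcl
# Interface endpoints for the ECR API, the ECR Docker registry, and CloudWatch Logs.
# Each one creates network interfaces (ENIs) with private IPs in the given subnets.
resource "aws_vpc_endpoint" "interface" {
  for_each            = toset(["ecr.api", "ecr.dkr", "logs"])
  vpc_id              = data.aws_vpc.vpc.id
  service_name        = "com.amazonaws.${var.region}.${each.key}"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [data.aws_subnet.private-1.id, data.aws_subnet.private-2.id]
  security_group_ids  = [aws_security_group.vpc_endpoints.id] # hypothetical SG
  private_dns_enabled = true
}

# Gateway endpoint for S3. It is attached to route tables and has
# no ENI and no security group of its own.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = data.aws_vpc.vpc.id
  service_name      = "com.amazonaws.${var.region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = [data.aws_route_table.private.id] # hypothetical data source
}
```

Note the difference between the two endpoint types: the interface endpoints resolve to private IPs inside the VPC, whereas traffic to the S3 gateway endpoint is still addressed to S3's public IP ranges and is only redirected by the route table.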

When the container starts, I get the following error:

CannotPullContainerError: ref pull has been retried 5 time(s): failed to copy: httpReadSeeker: failed open: failed to do request: Get 956469741060.dkr.ecr.us-east-1.amazonaws.com/my-ecr-repo:latest: dial tcp 52.216.78.32:443: i/o timeout

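So the container is still trying to pull the ECR image through a public IP (my VPC CIDR is 10.0.0.0/16). Needless to say, as soon as I open the Fargate egress to 0.0.0.0/0, the container is able to pull the ECR image, but I want to avoid that and only allow ingress/egress within the private subnets.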

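I checked the VPC endpoint configuration by launching an EC2 instance in the private subnet and running nslookup against all of the VPC endpoints above. Every one of them returned private IPs, which tells me the endpoints are in fact configured correctly.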

Given the EC2 nslookup test, I assume the problem is in my Fargate configuration. This is what the Terraform setup looks like:

resource "aws_ecs_cluster" "test_sdk" {
  name = "test-sdk-${var.stage}"
}

resource "aws_ecs_task_definition" "test_task_def" {
  family                   = "test-sdk-${var.stage}"
  network_mode             = "awsvpc"
  task_role_arn            = aws_iam_role.ecs_task_execution_role.arn
  execution_role_arn       = aws_iam_role.ecs_task_execution_role.arn
  requires_compatibilities = ["FARGATE"]
  cpu                      = 4096
  memory                   = 8192
  container_definitions = jsonencode(
    [
    {
      "name": "test-container",
      "image": "${data.aws_caller_identity.self.account_id}.dkr.ecr.${var.region}.amazonaws.com/test-sdk-${var.stage}:latest",
      "essential": true,
      "portMappings": [
        {
          "containerPort": var.container_port,
          "hostPort": var.container_port

        }
      ],

      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "ecs-test-${var.stage}",
          "awslogs-region": "${var.region}",
          "awslogs-stream-prefix": "streaming"
        }
      }
    }
    ]
  )
}

resource "aws_ecs_service" "test_service" {
  name            = "test-service"
  cluster         = aws_ecs_cluster.test_sdk.id
  task_definition = aws_ecs_task_definition.test_task_def.arn
  launch_type     = "FARGATE" 
  desired_count   = 1

  network_configuration {
    subnets = [data.aws_subnet.private-1.id, data.aws_subnet.private-2.id]
    security_groups = [aws_security_group.test-sg.id]
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.test-tg.arn
    container_name   = "test-container"
    container_port   = var.container_port
  }
}

# Create a security group allowing traffic on container port
resource "aws_security_group" "test-sg" {
  name   = "test-sg-${var.stage}"
  vpc_id = data.aws_vpc.vpc.id

  ingress {
    from_port   = var.container_port
    to_port     = var.container_port
    protocol    = "tcp"
    cidr_blocks = [
       data.aws_subnet.private-1.cidr_block,
       data.aws_subnet.private-2.cidr_block
      ]
  }

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [
       data.aws_subnet.private-1.cidr_block,
       data.aws_subnet.private-2.cidr_block
      ] # Allow traffic from private subnet
  }
  


  egress {
    from_port   = var.container_port
    to_port     = var.container_port
    protocol    = "tcp"
    cidr_blocks = [
       data.aws_subnet.private-1.cidr_block,
       data.aws_subnet.private-2.cidr_block
      ] # Allow traffic to the private subnets
  }
  egress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [
       data.aws_subnet.private-1.cidr_block,
       data.aws_subnet.private-2.cidr_block
      ] # Allow traffic to the private subnets
  }

}

# Create Application Load Balancer
resource "aws_lb" "test" {
  name               = "test-lb-${var.stage}"
  internal           = true
  load_balancer_type = "application"
  security_groups    = [aws_security_group.test-sg.id]
  subnets            = [data.aws_subnet.private-1.id, data.aws_subnet.private-2.id]
}

# Create Target Group
resource "aws_lb_target_group" "test-tg" {
  name     = "test-tg-${var.stage}"
  port     = var.container_port
  protocol = "HTTP"
  target_type = "ip"
  vpc_id   = data.aws_vpc.vpc.id
  health_check {
    enabled             = true
    healthy_threshold   = 2
    interval            = 90
    path                = "/"
    matcher             = "200-399"
    port                = var.container_port
    protocol            = "HTTP"
    timeout             = 40
    unhealthy_threshold = 2
}

}

# Create listener
resource "aws_lb_listener" "test-listener" {
  load_balancer_arn = aws_lb.test.arn
  port              = var.container_port
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.test-tg.arn
  }
}

# IAM
resource "aws_iam_role" "ecs_task_execution_role" {
  name = "tf-${var.project}-${var.stage}-ecs-task-execution-role"
  assume_role_policy = data.aws_iam_policy_document.ecs_assume_role_policy.json
  inline_policy {
    name = "test-sdk-ecr-repo-policy"
    policy = jsonencode({
      "Version" : "2012-10-17",
      "Statement" : [
        {
          "Effect" : "Allow",
          "Action" : [
            "ecr:GetAuthorizationToken",
            "ecr:BatchCheckLayerAvailability",
            "ecr:GetDownloadUrlForLayer",
            "ecr:BatchGetImage",
            "logs:CreateLogGroup",
            "logs:DescribeLogGroups",
            "logs:DescribeLogStreams",
            "logs:CreateLogStream",
            "logs:PutLogEvents",
            "secretsmanager:GetSecretValue",
            "events:PutEvents"
          ],
          "Resource" : "*"
        }
      ]
    })
  }
}

data "aws_iam_policy_document" "ecs_assume_role_policy" {
  statement {
    actions = [
      "sts:AssumeRole"
    ]
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}
amazon-ecs amazon-vpc aws-fargate amazon-ecr vpc-endpoint
1 Answer

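Because the S3 endpoint is a gateway endpoint, it does not create a network interface in the VPC. So even though your security group allows traffic to your VPC CIDR, the task cannot fetch the image without some changes: ECR stores image layers in S3 behind the scenes, and those layer downloads go to S3 IP addresses that lie outside the VPC CIDR. This matches the timeout against a public IP on port 443 in your error message.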

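As you mentioned in the comments, the solution is to add the prefix list ID created for S3 to the security group. In essence, this allowlists the S3 IP addresses for outbound traffic.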
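
As a rough sketch of that change, assuming the security group from the question (`aws_security_group.test-sg`) and its `var.region` variable, the extra egress rule could be written as shown below. The data source name `s3` is hypothetical, and the existing blocks of the security group are elided rather than repeated.

```hcl
# Look up the AWS-managed prefix list for S3 in this region.
data "aws_prefix_list" "s3" {
  name = "com.amazonaws.${var.region}.s3"
}

resource "aws_security_group" "test-sg" {
  # ... existing name, vpc_id, ingress and egress blocks unchanged ...

  # Allow outbound HTTPS to the S3 prefix list so that image layers
  # can be downloaded through the S3 gateway endpoint.
  egress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    prefix_list_ids = [data.aws_prefix_list.s3.id]
  }
}
```

If the S3 gateway endpoint is managed in the same Terraform configuration, its `prefix_list_id` attribute could be referenced instead of the data source. The rule is written as an inline `egress` block here because the security group in the question already uses inline rules, and the Terraform AWS provider documentation advises against mixing inline rules with separate `aws_security_group_rule` resources on the same group.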

This document outlines the details:

[image not preserved]
