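A long-running AWS Glue job that exports data from Redshift to S3 fails with
S3ServiceException: The provided token has expired
Amazon describes using a custom role as a workaround (here), but they do not provide any example. Can someone provide a CloudFormation snippet? What should such a role look like? If I use a Glue job, should I add actions for DynamoDB or an EMR cluster to the role policy?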
Here are several documentation pages that may help:
AWS Glue permissions: https://docs.aws.amazon.com/glue/latest/dg/permissions.html
IAM roles and policies for Amazon Web Services: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html
Amazon Redshift permissions: https://docs.aws.amazon.com/redshift/latest/mgmt/grant-privileges.html
S3 permissions: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-with-s3-actions.html
DynamoDB IAM permissions: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/iam-policy-examples.html
EMR IAM permissions: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-iam-roles.html
Below is an example CloudFormation configuration. Try it out and adjust it as needed:
Resources:
  GlueJobRole:
    Type: AWS::IAM::Role
    Properties:
      # Trust policy: lets the Glue service assume this role
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service: "glue.amazonaws.com"
            Action: "sts:AssumeRole"
      Policies:
        - PolicyName: "GlueJobS3Policy"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              # Permissions for Redshift
              # (COPY/UNLOAD and SELECT are SQL privileges granted inside
              # Redshift with GRANT; they are not IAM actions)
              - Effect: "Allow"
                Action:
                  - "redshift:DescribeClusters"
                Resource: "*"
              # Permissions for S3
              - Effect: "Allow"
                Action:
                  - "s3:GetObject"
                  - "s3:PutObject"
                  - "s3:ListBucket"
                  - "s3:ListBucketMultipartUploads"
                Resource:
                  - "arn:aws:s3:::your-s3-bucket-name/*"
                  - "arn:aws:s3:::your-s3-bucket-name"
              # Permissions for Glue resources
              - Effect: "Allow"
                Action:
                  - "glue:GetTable"
                  - "glue:GetTableVersion"
                  - "glue:GetTableVersions"
                  - "glue:GetDatabase"
                  - "glue:GetPartitions"
                  - "glue:BatchGetPartition"
                  - "glue:CreateJob"
                  - "glue:GetJob"
                  - "glue:UpdateJob"
                  - "glue:StartJobRun"
                  - "glue:GetJobRun"
                Resource: "*"
              # Permissions for DynamoDB (optional; only if the job uses DynamoDB)
              - Effect: "Allow"
                Action:
                  - "dynamodb:Scan"
                  - "dynamodb:Query"
                Resource: "*"
              # Permissions for EMR (optional; only if the job uses EMR)
              - Effect: "Allow"
                Action:
                  - "elasticmapreduce:ListClusters"
                  - "elasticmapreduce:DescribeCluster"
                  - "elasticmapreduce:DescribeStep"
                Resource: "*"
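
As far as I understand the AWS Glue documentation on moving data to and from Amazon Redshift, the expired-token error comes from the temporary credentials Glue hands to Redshift for COPY/UNLOAD, which expire after about an hour. The workaround described there is to associate an IAM role with the Redshift cluster itself and pass its ARN to the job through the aws_iam_role connection option, so that Redshift refreshes the credentials on its own. Here is a minimal sketch of such a role, assuming the same placeholder bucket as above; the names RedshiftUnloadRole and RedshiftUnloadS3Policy are hypothetical:

```yaml
Resources:
  # Hypothetical resource name. This role is assumed by Redshift itself,
  # not by Glue, so the trusted service differs from GlueJobRole.
  RedshiftUnloadRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service: "redshift.amazonaws.com"
            Action: "sts:AssumeRole"
      Policies:
        - PolicyName: "RedshiftUnloadS3Policy"  # hypothetical name
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              # Access to the bucket Redshift unloads into
              # (assumption: the same placeholder bucket as above)
              - Effect: "Allow"
                Action:
                  - "s3:GetObject"
                  - "s3:PutObject"
                  - "s3:ListBucket"
                Resource:
                  - "arn:aws:s3:::your-s3-bucket-name/*"
                  - "arn:aws:s3:::your-s3-bucket-name"
```

If you put this in the same template as GlueJobRole, add it under the existing Resources key rather than repeating that key. The role still has to be attached to the cluster (for example through the IamRoles property of AWS::Redshift::Cluster), and its ARN passed as aws_iam_role in the job's Redshift connection options; neither step is shown here.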