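I am trying to create an EMR cluster in AWS. It works fine when I do this through the UI: the cluster is created.
It also works fine when I create it from the command line with the following command: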
aws emr create-cluster \
--name "emr-cluster" \
--log-uri "s3n://aws-logs-433212232334-us-east-1/elasticmapreduce/" \
--release-label "emr-7.1.0" \
--service-role "arn:aws:iam::433212232334:role/service-role/AmazonEMR-ServiceRole-20240418T133040" \
--ec2-attributes '{"InstanceProfile":"AmazonEMR-InstanceProfile-20240418T133024","EmrManagedMasterSecurityGroup":"sg-07ced3d3223bet564","EmrManagedSlaveSecurityGroup":"sg-04b7742eed31ec137","KeyName":"emr-key-pair","AdditionalMasterSecurityGroups":[],"AdditionalSlaveSecurityGroups":[],"SubnetId":"subnet-0cf42a674f981a42a"}' \
--tags 'for-use-with-amazon-emr-managed-policies=true' \
--applications Name=Hadoop Name=Spark \
--configurations '[{"Classification":"spark-hive-site","Properties":{"hive.metastore.client.factory.class":"com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"}}]' \
--instance-groups '[{"InstanceCount":1,"InstanceGroupType":"CORE","Name":"Core","InstanceType":"m5.xlarge","EbsConfiguration":{"EbsBlockDeviceConfigs":[{"VolumeSpecification":{"VolumeType":"gp2","SizeInGB":32},"VolumesPerInstance":2}]}},{"InstanceCount":1,"InstanceGroupType":"MASTER","Name":"Primary","InstanceType":"m5.xlarge","EbsConfiguration":{"EbsBlockDeviceConfigs":[{"VolumeSpecification":{"VolumeType":"gp2","SizeInGB":32},"VolumesPerInstance":2}]}}]' \
--scale-down-behavior "TERMINATE_AT_TASK_COMPLETION" \
--auto-termination-policy '{"IdleTimeout":3600}' \
--region "us-east-1"
But when I run it with boto3, the cluster terminates with the following error:
Service role arn:aws:iam::433212232334:role/service-role/AmazonEMR-ServiceRole-20240418T133040 has insufficient EC2 permissions
Here is the boto3 code:
import boto3

emr_client = boto3.client('emr')

response = emr_client.run_job_flow(
    Name='emr-cluster',
    LogUri='s3n://aws-logs-433212232334-us-east-1/elasticmapreduce/',
    ReleaseLabel='emr-7.1.0',
    ServiceRole="arn:aws:iam::433212232334:role/service-role/AmazonEMR-ServiceRole-20240418T133040",
    JobFlowRole="AmazonEMR-InstanceProfile-20240418T133024",
    ScaleDownBehavior='TERMINATE_AT_TASK_COMPLETION',
    EbsRootVolumeSize=15,
    AutoTerminationPolicy={
        'IdleTimeout': 3600
    },
    Applications=[
        {
            'Name': 'Spark'
        },
        {
            'Name': 'Hadoop'
        }
    ],
    Configurations=[
        {
            "Classification": "spark-hive-site",
            "Properties": {
                "hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
            }
        }
    ],
    Instances={
        'KeepJobFlowAliveWhenNoSteps': True,
        'Ec2KeyName': 'emr-key-pair',
        'InstanceGroups': [
            {
                "InstanceCount": 1,
                "InstanceRole": "CORE",
                "Name": "Core",
                "InstanceType": "m5.xlarge",
                "EbsConfiguration": {
                    "EbsBlockDeviceConfigs": [
                        {
                            "VolumeSpecification": {
                                "VolumeType": "gp2",
                                "SizeInGB": 32
                            },
                            "VolumesPerInstance": 2
                        }
                    ]
                }
            },
            {
                "InstanceCount": 1,
                "InstanceRole": "MASTER",
                "Name": "Primary",
                "InstanceType": "m5.xlarge",
                "EbsConfiguration": {
                    "EbsBlockDeviceConfigs": [
                        {
                            "VolumeSpecification": {
                                "VolumeType": "gp2",
                                "SizeInGB": 32
                            },
                            "VolumesPerInstance": 2
                        }
                    ]
                }
            }
        ]
    }
)
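Why does it work when I create the cluster with the same service role using the AWS CLI or the AWS Web Console, but when I do the same thing with boto3, the cluster throws this error saying that the service role has insufficient permissions?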
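Resolving the permission error when creating an EMR cluster with Boto3:
For the instance profile:
By setting up these roles with the correct policies and trust relationships, you should avoid the "insufficient EC2 permissions" error.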
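Since the same service role works from the CLI, it may also be worth comparing the two requests directly. The CLI command passes --tags 'for-use-with-amazon-emr-managed-policies=true', a SubnetId, and the two EMR-managed security groups, while the boto3 call passes none of them. If the service role relies on the AmazonEMRServicePolicy_v2 managed policy (an assumption, since the question does not show the attached policies), that policy only grants EC2 permissions on resources carrying this tag, which would explain the error. The sketch below mirrors the CLI arguments in boto3; the helper names build_run_job_flow_args, instance_group, and create_cluster are hypothetical, and all IDs are copied from the question.

```python
def instance_group(role, name):
    """Build one instance group matching the layout used in the question."""
    return {
        "InstanceCount": 1,
        "InstanceRole": role,
        "Name": name,
        "InstanceType": "m5.xlarge",
        "EbsConfiguration": {
            "EbsBlockDeviceConfigs": [
                {
                    "VolumeSpecification": {"VolumeType": "gp2", "SizeInGB": 32},
                    "VolumesPerInstance": 2,
                }
            ]
        },
    }


def build_run_job_flow_args():
    """Return run_job_flow keyword arguments that mirror the working CLI command."""
    return {
        "Name": "emr-cluster",
        "LogUri": "s3n://aws-logs-433212232334-us-east-1/elasticmapreduce/",
        "ReleaseLabel": "emr-7.1.0",
        "ServiceRole": "arn:aws:iam::433212232334:role/service-role/AmazonEMR-ServiceRole-20240418T133040",
        "JobFlowRole": "AmazonEMR-InstanceProfile-20240418T133024",
        "ScaleDownBehavior": "TERMINATE_AT_TASK_COMPLETION",
        "AutoTerminationPolicy": {"IdleTimeout": 3600},
        # Present in the CLI command but missing from the original boto3 call.
        "Tags": [
            {"Key": "for-use-with-amazon-emr-managed-policies", "Value": "true"}
        ],
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "Configurations": [
            {
                "Classification": "spark-hive-site",
                "Properties": {
                    "hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
                },
            }
        ],
        "Instances": {
            "KeepJobFlowAliveWhenNoSteps": True,
            "Ec2KeyName": "emr-key-pair",
            # Also taken from --ec2-attributes in the CLI command.
            "Ec2SubnetId": "subnet-0cf42a674f981a42a",
            "EmrManagedMasterSecurityGroup": "sg-07ced3d3223bet564",
            "EmrManagedSlaveSecurityGroup": "sg-04b7742eed31ec137",
            "InstanceGroups": [
                instance_group("CORE", "Core"),
                instance_group("MASTER", "Primary"),
            ],
        },
    }


def create_cluster(region_name="us-east-1"):
    """Call EMR with the arguments above; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the argument builder can be checked offline

    emr_client = boto3.client("emr", region_name=region_name)
    return emr_client.run_job_flow(**build_run_job_flow_args())
```

Calling create_cluster() returns the usual run_job_flow response, whose JobFlowId identifies the new cluster. The region is passed explicitly because the CLI command sets --region "us-east-1", whereas boto3.client('emr') falls back to whatever default region is configured locally. If the error persists with the tag, subnet, and security groups in place, the role's attached policies and trust relationship are the next thing to check.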