
Terraform AWS EMR HBase cluster creation - application provisioning timed out

I use Terraform to create an HBase cluster on AWS. With these settings, the cluster is provisioned successfully:

resource "aws_emr_cluster" "hbase" {
  name          = "hbase"
  release_label = "emr-6.3.1"
  applications  = ["HBase"]


  termination_protection            = false
  keep_job_flow_alive_when_no_steps = true

  ec2_attributes {
    key_name  = <removed>
    subnet_id = <removed>
    
    instance_profile = aws_iam_instance_profile.emr_profile.arn
  }

  master_instance_group {
    instance_type  = "m1.medium"
    instance_count = 1
  }

  core_instance_group {
    instance_type  = "m1.medium"
    instance_count = 4

    ebs_config {
      size                 = 20
      type                 = "gp2"
      volumes_per_instance = 1
    }
  }

  ebs_root_volume_size = 10
}

As soon as I increase the number of master nodes to three, the cluster creation fails with the error message:

Error: Error waiting for EMR Cluster state to be "WAITING" or "RUNNING": TERMINATING: BOOTSTRAP_FAILURE: On the master instance (i-), application provisioning timed out

I checked the documentation for aws_emr_cluster, but could not find any option to set a timeout.

I also checked the timeout settings for IAM roles, but the default is one hour, which would be more than sufficient. https://docs.aws.amazon.com/en_en/IAM/latest/UserGuide/id_roles_use.html

I get the above-mentioned error every time cluster creation takes longer than about 16 minutes (16 minutes and 20 seconds according to the Terraform output).

I have also created an AWS MSK resource in the same project, which took longer than 17 minutes and finished successfully without complaint. So this does not seem to be a global timeout value.

Any ideas would be much appreciated.

Btw:

terraform version
Terraform v1.1.2
on darwin_amd64
+ provider registry.terraform.io/hashicorp/aws v3.60.0

Best,
Denny

The issue has now been resolved. To keep costs down for this (test) setup I chose the instance type "m1.medium", and that turned out to be the problem. Using a bigger instance type solved it.
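
For anyone hitting the same error, here is a minimal sketch of what the change might look like inside the aws_emr_cluster resource. The post does not say which instance type was used in the end or whether the core nodes were changed as well, so "m5.xlarge" on both groups is purely an assumption for illustration. Pick whatever type fits your workload and budget.

```hcl
# Fragment of the aws_emr_cluster "hbase" resource; all other arguments stay as in the question.

  master_instance_group {
    # Assumption: "m5.xlarge" is an illustrative choice, not the type named in the post.
    instance_type  = "m5.xlarge"
    instance_count = 3 # three master nodes, the setup that timed out on m1.medium
  }

  core_instance_group {
    # Assumption: core nodes are resized too; the post does not say whether this was needed.
    instance_type  = "m5.xlarge"
    instance_count = 4

    ebs_config {
      size                 = 20
      type                 = "gp2"
      volumes_per_instance = 1
    }
  }
```

If provisioning still times out, it may be worth trying a larger type on the master group alone first, since the error message refers to the master instance.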
