
Error deploying EKS node-group with Terraform

I'm having trouble deploying the node group of an EKS cluster with Terraform. The error looks like a plugin problem, but I don't know how to solve it.

If I look at EC2 in the AWS console (web), I can see the cluster's instances, but the cluster itself shows this error.

The error shown in my pipeline:

Error: error waiting for EKS Node Group (UNIR-API-REST-CLUSTER-DEV:node_sping_boot) creation: NodeCreationFailure: Instances failed to join the kubernetes cluster. Resource IDs: [i-05ed58f8101240dc8]

  on EKS.tf line 17, in resource "aws_eks_node_group" "nodes":
  17: resource "aws_eks_node_group" "nodes"

2020-06-01T00:03:50.576Z [DEBUG] plugin: plugin process exited: path=/home/ubuntu/.jenkins/workspace/shop_infraestucture_generator_pipline/shop-proyect-dev/.terraform/plugins/linux_amd64/terraform-provider-aws_v2.64.0_x4 pid=13475
2020-06-01T00:03:50.576Z [DEBUG] plugin: plugin exited

And the error printed in the AWS console:

link

This is the Terraform code I use to create the project:

EKS.tf, for creating the cluster and the nodes:

resource "aws_eks_cluster" "CLUSTER" {
  name     = "UNIR-API-REST-CLUSTER-${var.SUFFIX}"
  role_arn = "${aws_iam_role.eks_cluster_role.arn}"
  vpc_config {
    subnet_ids = [
      "${aws_subnet.unir_subnet_cluster_1.id}","${aws_subnet.unir_subnet_cluster_2.id}"
    ]
  }
  depends_on = [
    "aws_iam_role_policy_attachment.AmazonEKSWorkerNodePolicy",
    "aws_iam_role_policy_attachment.AmazonEKS_CNI_Policy",
    "aws_iam_role_policy_attachment.AmazonEC2ContainerRegistryReadOnly",
  ]
}


resource "aws_eks_node_group" "nodes" {
  cluster_name    = "${aws_eks_cluster.CLUSTER.name}"
  node_group_name = "node_sping_boot"
  node_role_arn   = "${aws_iam_role.eks_nodes_role.arn}"
  subnet_ids      = [
      "${aws_subnet.unir_subnet_cluster_1.id}","${aws_subnet.unir_subnet_cluster_2.id}"
  ]
  scaling_config {
    desired_size = 1
    max_size     = 5
    min_size     = 1
  }
# instance_types defaults to t3.medium
# Ensure that IAM Role permissions are created before and deleted after EKS Node Group handling.
# Otherwise, EKS will not be able to properly delete EC2 Instances and Elastic Network Interfaces.
  depends_on = [
    "aws_iam_role_policy_attachment.AmazonEKSWorkerNodePolicy",
    "aws_iam_role_policy_attachment.AmazonEKS_CNI_Policy",
    "aws_iam_role_policy_attachment.AmazonEC2ContainerRegistryReadOnly",
  ]
}

output "eks_cluster_endpoint" {
  value = "${aws_eks_cluster.CLUSTER.endpoint}"
}

output "eks_cluster_certificat_authority" {
    value = "${aws_eks_cluster.CLUSTER.certificate_authority}"
}

securityAndGroups.tf

resource "aws_iam_role" "eks_cluster_role" {
  name = "eks-cluster-${var.SUFFIX}"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}


resource "aws_iam_role" "eks_nodes_role" {
  name = "eks-node-${var.SUFFIX}"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}


resource "aws_iam_role_policy_attachment" "AmazonEKSClusterPolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = "${aws_iam_role.eks_cluster_role.name}"
}

resource "aws_iam_role_policy_attachment" "AmazonEKSServicePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSServicePolicy"
  role       = "${aws_iam_role.eks_cluster_role.name}"
}

resource "aws_iam_role_policy_attachment" "AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = "${aws_iam_role.eks_nodes_role.name}"
}

resource "aws_iam_role_policy_attachment" "AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = "${aws_iam_role.eks_nodes_role.name}"
}

resource "aws_iam_role_policy_attachment" "AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = "${aws_iam_role.eks_nodes_role.name}"
}

VPCAndRouting.tf, for creating my routing, VPC, and subnets:

resource "aws_vpc" "unir_shop_vpc_dev" {
  cidr_block = "${var.NET_CIDR_BLOCK}"
  enable_dns_hostnames = true
  enable_dns_support = true
  tags = {
    Name = "UNIR-VPC-SHOP-${var.SUFFIX}"
    Environment = "${var.SUFFIX}"
  }
}
resource "aws_route_table" "route" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.unir_gat_shop_dev.id}"
  }
  tags = {
    Name = "UNIR-RoutePublic-${var.SUFFIX}"
    Environment = "${var.SUFFIX}"
  }
}

data "aws_availability_zones" "available" {
  state = "available"
}
resource "aws_subnet" "unir_subnet_aplications" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block = "${var.SUBNET_CIDR_APLICATIONS}"
  availability_zone = "${var.ZONE_SUB}"
  depends_on = ["aws_internet_gateway.unir_gat_shop_dev"]
  map_public_ip_on_launch = true
  tags = {
    Name = "UNIR-SUBNET-APLICATIONS-${var.SUFFIX}"
    Environment = "${var.SUFFIX}"
  }
}

resource "aws_subnet" "unir_subnet_cluster_1" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block = "${var.SUBNET_CIDR_CLUSTER_1}"
  map_public_ip_on_launch = true
  availability_zone = "${var.ZONE_SUB_CLUSTER_2}"
  tags = {
    "kubernetes.io/cluster/UNIR-API-REST-CLUSTER-${var.SUFFIX}" = "shared"
  }
}

resource "aws_subnet" "unir_subnet_cluster_2" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block = "${var.SUBNET_CIDR_CLUSTER_2}"
  availability_zone = "${var.ZONE_SUB_CLUSTER_1}"
  map_public_ip_on_launch = true
  tags = {
    "kubernetes.io/cluster/UNIR-API-REST-CLUSTER-${var.SUFFIX}" = "shared"
  }

}

resource "aws_internet_gateway" "unir_gat_shop_dev" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  tags = {
    Environment = "${var.SUFFIX}"
    Name = "UNIR-publicGateway-${var.SUFFIX}"
  }
}

My variables:

SUFFIX="DEV"
ZONE="eu-west-1"
TERRAFORM_USER_ID=
TERRAFORM_USER_PASS=
ZONE_SUB="eu-west-1b"
ZONE_SUB_CLUSTER_1="eu-west-1a"
ZONE_SUB_CLUSTER_2="eu-west-1c"
NET_CIDR_BLOCK="172.15.0.0/24"
SUBNET_CIDR_APLICATIONS="172.15.0.0/27"
SUBNET_CIDR_CLUSTER_1="172.15.0.32/27"
SUBNET_CIDR_CLUSTER_2="172.15.0.64/27"
SUBNET_CIDR_CLUSTER_3="172.15.0.128/27"
SUBNET_CIDR_CLUSTER_4="172.15.0.160/27"
SUBNET_CIDR_CLUSTER_5="172.15.0.192/27"
SUBNET_CIDR_CLUSTER_6="172.15.0.224/27"
MONGO_SSH_KEY=
KIBANA_SSH_KEY=
CLUSTER_SSH_KEY=

Do you need more logs?

There is also a quick way to troubleshoot such issues: you can use the "AWSSupport-TroubleshootEKSWorkerNode" runbook. This runbook is designed to help troubleshoot EKS worker nodes that fail to join an EKS cluster. Go to AWS Systems Manager -> Automation -> select the runbook -> execute the runbook with the ClusterName and Instance-id.

This is very helpful for troubleshooting and gives a nice summary at the end of the execution.

You can also refer to the documentation here.
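
If you prefer the command line, the same automation can be started with the AWS CLI. A minimal sketch, assuming the runbook's input parameters are named ClusterName and WorkerID (verify them with aws ssm describe-document --name AWSSupport-TroubleshootEKSWorkerNode before relying on this):

# Start the troubleshooting runbook against the failing instance from the question.
aws ssm start-automation-execution \
  --document-name "AWSSupport-TroubleshootEKSWorkerNode" \
  --parameters "ClusterName=UNIR-API-REST-CLUSTER-DEV,WorkerID=i-05ed58f8101240dc8"

# Retrieve the summary once it finishes, using the returned AutomationExecutionId.
aws ssm get-automation-execution --automation-execution-id <execution-id>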

According to the AWS documentation:

If you receive the error "Instances failed to join the kubernetes cluster" in the AWS Management Console, ensure that either the cluster's private endpoint access is enabled, or that you have correctly configured CIDR blocks for public endpoint access. For more information, see Amazon EKS cluster endpoint access control.
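
In this Terraform configuration, that translates to the endpoint access settings on the aws_eks_cluster resource. A sketch using the question's resource names (endpoint_private_access, endpoint_public_access, and public_access_cidrs are standard vpc_config arguments in the AWS provider):

resource "aws_eks_cluster" "CLUSTER" {
  name     = "UNIR-API-REST-CLUSTER-${var.SUFFIX}"
  role_arn = "${aws_iam_role.eks_cluster_role.arn}"
  vpc_config {
    subnet_ids = [
      "${aws_subnet.unir_subnet_cluster_1.id}","${aws_subnet.unir_subnet_cluster_2.id}"
    ]
    # Allow worker nodes to reach the control plane from inside the VPC...
    endpoint_private_access = true
    # ...and/or keep the public endpoint but restrict which CIDRs may reach it.
    endpoint_public_access  = true
    public_access_cidrs     = ["0.0.0.0/0"]
  }
}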

I noticed that you are switching your subnets' availability zones:

resource "aws_subnet" "unir_subnet_cluster_1" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block = "${var.SUBNET_CIDR_CLUSTER_1}"
  map_public_ip_on_launch = true
  availability_zone = "${var.ZONE_SUB_CLUSTER_2}"

You have assigned var.ZONE_SUB_CLUSTER_2 to unir_subnet_cluster_1 and var.ZONE_SUB_CLUSTER_1 to unir_subnet_cluster_2. Maybe this misconfiguration is the cause, as in the corrected sketch below.
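
A corrected version would simply match each subnet to its own zone variable (a sketch keeping the asker's names; the kubernetes.io/cluster tags are omitted for brevity but should stay):

resource "aws_subnet" "unir_subnet_cluster_1" {
  vpc_id                  = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block              = "${var.SUBNET_CIDR_CLUSTER_1}"
  map_public_ip_on_launch = true
  # Zone variable now matches the subnet's number.
  availability_zone       = "${var.ZONE_SUB_CLUSTER_1}"
}

resource "aws_subnet" "unir_subnet_cluster_2" {
  vpc_id                  = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block              = "${var.SUBNET_CIDR_CLUSTER_2}"
  map_public_ip_on_launch = true
  availability_zone       = "${var.ZONE_SUB_CLUSTER_2}"
}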

As described under "NodeCreationFailure", the error has two possible causes:

NodeCreationFailure: Your launched instances are unable to register with your Amazon EKS cluster. Common causes of this failure are insufficient node IAM role permissions or lack of outbound internet access for the nodes.
Your nodes must be able to access the internet using a public IP address to function properly.
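
To see which of the two applies, you can read the node group's health issues directly. A sketch with the AWS CLI, using the cluster and node group names from the question:

# NodeCreationFailure and related codes show up under nodegroup.health.issues.
aws eks describe-nodegroup \
  --cluster-name UNIR-API-REST-CLUSTER-DEV \
  --nodegroup-name node_sping_boot \
  --query "nodegroup.health.issues"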

In my case, the cluster was inside private subnets, and after adding a route to a NAT gateway the error went away.

I ran into the same problem and solved it by creating another NAT gateway in a public subnet and routing my private subnet to the new NAT gateway.
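
For reference, the NAT arrangement both of these answers describe could look roughly like this in Terraform. A sketch in the question's style; aws_subnet.private is a hypothetical private subnet, while the public subnet, VPC, and internet gateway names are borrowed from the question:

# Elastic IP for the NAT gateway.
resource "aws_eip" "nat" {
  vpc = true
}

# NAT gateway in a public subnet (one routed to the internet gateway).
resource "aws_nat_gateway" "nat" {
  allocation_id = "${aws_eip.nat.id}"
  subnet_id     = "${aws_subnet.unir_subnet_aplications.id}"
  depends_on    = ["aws_internet_gateway.unir_gat_shop_dev"]
}

# Private route table: send the private subnet's outbound traffic through NAT.
resource "aws_route_table" "private" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = "${aws_nat_gateway.nat.id}"
  }
}

resource "aws_route_table_association" "private" {
  subnet_id      = "${aws_subnet.private.id}"   # hypothetical private subnet
  route_table_id = "${aws_route_table.private.id}"
}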
