
kubeadm init fails to initialize an AWS cluster with the message "could not init cloud provider "aws": error finding instance ... timeout"

The issue I have is that kubeadm never fully initializes. The output:

...
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
...
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
...

and journalctl -xeu kubelet shows the following interesting info:

Dec 03 17:54:08 ip-10-83-62-10.ec2.internal kubelet[14709]: W1203 17:54:08.017925   14709 plugins.go:105] WARNING: aws built-in cloud provider is now deprecated. The AWS provider is deprecated and will be removed in a future release
Dec 03 17:54:08 ip-10-83-62-10.ec2.internal kubelet[14709]: I1203 17:54:08.018044   14709 aws.go:1235] Building AWS cloudprovider
Dec 03 17:54:08 ip-10-83-62-10.ec2.internal kubelet[14709]: I1203 17:54:08.018112   14709 aws.go:1195] Zone not specified in configuration file; querying AWS metadata service
Dec 03 17:56:08 ip-10-83-62-10.ec2.internal kubelet[14709]: F1203 17:56:08.332951   14709 server.go:265] failed to run Kubelet: could not init cloud provider "aws": error finding instance  i-03e00e9192370ca0d: "error listing AWS instances: \"RequestError: send request failed\\ncaused by: Post \\\"https://ec2.us-east-1.amazonaws.com/\\\": dial tcp 10.83.60.11:443: i/o timeout

The context: this is a fully private AWS VPC. There is a proxy, and its settings are propagated to the k8s manifests.

The kubeadm.yaml config is pretty innocent and looks like this:

---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    cloud-provider: aws
clusterName: cdspidr
controlPlaneEndpoint: ip-10-83-62-10.ec2.internal
controllerManager:
  extraArgs:
    cloud-provider: aws
    configure-cloud-routes: "false"
kubernetesVersion: stable
networking:
  dnsDomain: cluster.local
  podSubnet: 10.83.62.0/24
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  name: ip-10-83-62-10.ec2.internal
  kubeletExtraArgs:
    cloud-provider: aws

I'm looking for help to figure out a few things here:

  1. Why does kubeadm use this address ( https://ec2.us-east-1.amazonaws.com ) to retrieve availability zones? It does not look correct. IMO, it should be something like http://169.254.169.254/latest/dynamic/instance-identity/document

  2. Why does it fail? With the same proxy settings, a curl request from the terminal returns the web page.

  3. To work around it, how can I specify availability zones on my own, either in kubeadm.yaml or via a command-line option for kubeadm?

I would appreciate any help or thoughts.

You can create a VPC endpoint for accessing EC2 (service name: com.amazonaws.us-east-1.ec2). This will allow the kubelet to talk to EC2 without internet access and fetch the required info.

While creating the VPC endpoint, please make sure to enable the private DNS resolution option.
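
As a rough sketch of the endpoint creation described above, the AWS CLI could be used along these lines. The VPC, subnet, and security group IDs in the commented invocation are placeholders and not values from the question; only the region and service name come from the post. The function name `create_ec2_endpoint` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: create an interface VPC endpoint for EC2 with private DNS enabled.
# Assumption: the AWS CLI is installed and has permission to create endpoints.
REGION="us-east-1"
SERVICE_NAME="com.amazonaws.${REGION}.ec2"   # service name given in the answer

# Hypothetical helper: $1 = VPC ID, $2 = subnet ID, $3 = security group ID.
create_ec2_endpoint() {
  aws ec2 create-vpc-endpoint \
    --region "$REGION" \
    --vpc-id "$1" \
    --vpc-endpoint-type Interface \
    --service-name "$SERVICE_NAME" \
    --subnet-ids "$2" \
    --security-group-ids "$3" \
    --private-dns-enabled
}

# Example invocation with placeholder IDs (replace with your own):
# create_ec2_endpoint vpc-0123456789abcdef0 subnet-0123456789abcdef0 sg-0123456789abcdef0
```

The security group attached to the endpoint needs to allow inbound HTTPS (port 443) from the node, otherwise the connection will presumably keep timing out.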

Also, from the error it looks like the kubelet is trying to fetch the instance itself, not just the availability zone ("aws": error finding instance i-03e00e9192370ca0d: "error listing AWS instances").
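
To see whether the node can reach EC2 at all, the failing lookup can be repeated by hand. The following is only a sketch: the host and instance ID are taken from the kubelet log, the function name `check_ec2_access` is made up, and it is an assumption that `describe-instances` mirrors the kubelet's "listing AWS instances" step. It also assumes curl and the AWS CLI are available on the node with usable credentials.

```shell
#!/usr/bin/env bash
# Sketch: repeat the kubelet's failing EC2 lookup manually from the node.
REGION="us-east-1"
INSTANCE_ID="i-03e00e9192370ca0d"            # instance ID from the kubelet log
ENDPOINT_HOST="ec2.${REGION}.amazonaws.com"  # host from the kubelet log

# Hypothetical helper that runs three checks in order.
check_ec2_access() {
  # Show which address the endpoint name resolves to on this node.
  getent hosts "$ENDPOINT_HOST"
  # Try an HTTPS connection while bypassing any proxy; print only the status code.
  curl -sS --noproxy '*' --connect-timeout 5 -o /dev/null \
    -w '%{http_code}\n' "https://${ENDPOINT_HOST}/"
  # Ask EC2 for the instance and print its availability zone.
  aws ec2 describe-instances --region "$REGION" --instance-ids "$INSTANCE_ID" \
    --query 'Reservations[].Instances[].Placement.AvailabilityZone' --output text
}

# Run on the control-plane node:
# check_ec2_access
```

If the curl step times out while a proxied curl from the same terminal succeeds, that would suggest the kubelet's request is not going through the proxy.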
