Fail to init aws cluster (kubeadm init) with the message "could not init cloud provider "aws": error finding instance ... timeout

Question

The issue I have is that kubeadm will never fully initialize. The output:

...
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
...
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
...

and journalctl -xeu kubelet shows the following interesting info:

Dec 03 17:54:08 ip-10-83-62-10.ec2.internal kubelet[14709]: W1203 17:54:08.017925   14709 plugins.go:105] WARNING: aws built-in cloud provider is now deprecated. The AWS provider is deprecated. The AWS provider is deprecated and will be removed in a future release
Dec 03 17:54:08 ip-10-83-62-10.ec2.internal kubelet[14709]: I1203 17:54:08.018044   14709 aws.go:1235] Building AWS cloudprovider
Dec 03 17:54:08 ip-10-83-62-10.ec2.internal kubelet[14709]: I1203 17:54:08.018112   14709 aws.go:1195] Zone not specified in configuration file; querying AWS metadata service
Dec 03 17:56:08 ip-10-83-62-10.ec2.internal kubelet[14709]: F1203 17:56:08.332951   14709 server.go:265] failed to run Kubelet: could not init cloud provider "aws": error finding instance  i-03e00e9192370ca0d: "error listing AWS instances: \"RequestError: send request failed\\ncaused by: Post \\\"https://ec2.us-east-1.amazonaws.com/\\\": dial tcp 10.83.60.11:443: i/o timeout

The context is: it's a fully private AWS VPC. There is a proxy that is propagated to k8s manifests.

the kubeadm.yaml config is pretty innocent and looks like this

---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    cloud-provider: aws
clusterName: cdspidr
controlPlaneEndpoint: ip-10-83-62-10.ec2.internal
controllerManager:
  extraArgs:
    cloud-provider: aws
    configure-cloud-routes: "false"
kubernetesVersion: stable
networking:
  dnsDomain: cluster.local
  podSubnet: 10.83.62.0/24
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  name: ip-10-83-62-10.ec2.internal
  kubeletExtraArgs:
    cloud-provider: was

I'm looking for help to figure out a couple of things here:

why does kubeadm use this address ( https://ec2.us-east-1.amazonaws.com ) to retrieve availability zones? It does not look correct. IMO, it should be something like http://169.254.169.254/latest/dynamic/instance-identity/document
why does it fail? With the same proxy settings, a curl request from the terminal returns the web page.
To workaround it, how can I specify availability zones on my own in kubeadm.yaml or via a command like for kubeadm?

I would appreciate any help or thoughts.

Answer 1

You can create a VPC endpoint for accessing Ec2 (service name - com.amazonaws.us-east-1.ec2), this will allow the kubelet to talk to Ec2 without internet and fetch the required info.

While creating the VPC endpoint please make sure to enable private DNS resolution option.

Also from the error it looks like that kubelet is trying to fetch the instance not just availability zone. ("aws": error finding instance i-03e00e9192370ca0d: "error listing AWS instances).

Fail to init aws cluster (kubeadm init) with the message "could not init cloud provider "aws": error finding instance ... timeout

Question

1 answers

solution1
1 2021-06-30 11:14:51

Fail to init aws cluster (kubeadm init) with the message "could not init cloud provider "aws": error finding instance ... timeout

Question

1 answers

solution1 1 2021-06-30 11:14:51

solution1
1 2021-06-30 11:14:51