"unable to retrieve the complete list of server APIs: tap.linkerd.io/v1alpha1" error using Linkerd on private cluster in GKE
Why does the following error occur when I install Linkerd 2.x on a private cluster in GKE?
Error: could not get apiVersions from Kubernetes: unable to retrieve the complete list of server APIs: tap.linkerd.io/v1alpha1: the server is currently unable to handle the request
The default firewall rules of a private cluster on GKE only permit traffic on ports 443 and 10250. This allows communication to the kube-apiserver and kubelet, respectively.
Linkerd uses ports 8443 and 8089 for communication between the control plane and the proxies deployed to the data plane.
The tap component uses port 8089 to handle requests to its apiserver.
The proxy injector and service profile validator components, both of which are types of admission controllers, use port 8443 to handle requests.
The Linkerd 2 docs include instructions for configuring your firewall on a GKE private cluster: https://linkerd.io/2/reference/cluster-configuration/
They are included below:
Get the cluster name:
CLUSTER_NAME=your-cluster-name
gcloud config set compute/zone your-zone-or-region
Get the cluster MASTER_IPV4_CIDR:
MASTER_IPV4_CIDR=$(gcloud container clusters describe $CLUSTER_NAME \
| grep "masterIpv4CidrBlock: " \
| awk '{print $2}')
Get the cluster NETWORK:
NETWORK=$(gcloud container clusters describe $CLUSTER_NAME \
| grep "^network: " \
| awk '{print $2}')
Get the cluster auto-generated NETWORK_TARGET_TAG:
NETWORK_TARGET_TAG=$(gcloud compute firewall-rules list \
--filter network=$NETWORK --format json \
| jq ".[] | select(.name | contains(\"$CLUSTER_NAME\"))" \
| jq -r '.targetTags[0]' | head -1)
Verify the values:
echo $MASTER_IPV4_CIDR $NETWORK $NETWORK_TARGET_TAG
# example output
10.0.0.0/28 foo-network gke-foo-cluster-c1ecba83-node
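If any of the lookups above returns nothing, the firewall rule in the next step would be created with a wrong or empty argument. A minimal sketch of a sanity check follows; the function name check_fw_inputs is hypothetical and not part of the Linkerd docs, and the CIDR test is only a rough shape check, not a full validation.

```shell
# check_fw_inputs CIDR NETWORK TAG   (hypothetical helper, for illustration)
# Returns 0 when all three values look usable, non-zero otherwise.
check_fw_inputs() {
  cidr=$1
  network=$2
  tag=$3
  # An empty value means one of the earlier lookups returned nothing.
  if [ -z "$cidr" ] || [ -z "$network" ] || [ -z "$tag" ]; then
    echo "error: one of CIDR, NETWORK, TAG is empty" >&2
    return 1
  fi
  # jq -r prints the literal string "null" when .targetTags is missing.
  if [ "$tag" = "null" ]; then
    echo "error: no target tag found on the matching firewall rule" >&2
    return 1
  fi
  # Rough shape check for an IPv4 CIDR such as 10.0.0.0/28.
  if ! printf '%s\n' "$cidr" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}/[0-9]{1,2}$'; then
    echo "error: '$cidr' does not look like an IPv4 CIDR" >&2
    return 1
  fi
  return 0
}

# Usage, with the variables set in the steps above:
#   check_fw_inputs "$MASTER_IPV4_CIDR" "$NETWORK" "$NETWORK_TARGET_TAG" && echo "values look OK"
```

The check stops before any gcloud call is made, so it can be run as often as needed.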
Create the firewall rules for proxy-injector and tap:
gcloud compute firewall-rules create gke-to-linkerd-control-plane \
--network "$NETWORK" \
--allow "tcp:8443,tcp:8089" \
--source-ranges "$MASTER_IPV4_CIDR" \
--target-tags "$NETWORK_TARGET_TAG" \
--priority 1000 \
--description "Allow traffic on ports 8443, 8089 for linkerd control-plane components"
Finally, verify that the firewall rule is created:
gcloud compute firewall-rules describe gke-to-linkerd-control-plane
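Beyond confirming that the rule exists, it can help to confirm that it opens both ports, since a single mistyped digit leaves tap or the proxy injector unreachable. The sketch below assumes the allowed entries have already been reduced to one "protocol:port" pair per line; the function name missing_ports is hypothetical, and port ranges such as 8000-9000 are not handled.

```shell
# missing_ports PORT...   (hypothetical helper, for illustration)
# Reads "protocol:port" entries, one per line, on stdin and prints every
# requested TCP port that does not appear among them.
missing_ports() {
  allowed=$(cat)
  for port in "$@"; do
    # -x matches the whole line, so tcp:8443 does not match tcp:84430.
    if ! printf '%s\n' "$allowed" | grep -qx "tcp:$port"; then
      echo "$port"
    fi
  done
}

# Possible usage, assuming jq is installed and that the JSON output of
# `describe` carries allowed[].IPProtocol and allowed[].ports[]:
#   gcloud compute firewall-rules describe gke-to-linkerd-control-plane --format json \
#     | jq -r '.allowed[] | .IPProtocol as $p | .ports[] | "\($p):\(.)"' \
#     | missing_ports 8443 8089
```

Empty output means both ports are present; any printed port still needs to be added to the rule.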
Solution:
The steps I followed are:
1. Run kubectl get apiservices. If the linkerd apiservice is down with the error CrashLoopBackOff, follow step 2; otherwise just try to restart the linkerd service using kubectl delete apiservice/"service_name". For me it was v1alpha1.tap.linkerd.io.
2. Run kubectl get pods -n kube-system. I found that pods such as metrics-server, linkerd, and kubernetes-dashboard were down because the main CoreDNS pod was down.
For me it was:
NAME READY STATUS RESTARTS AGE
pod/coredns-85577b65b-zj2x2 0/1 CrashLoopBackOff 7 13m
3. Check the error in the CoreDNS pod using kubectl describe pod/"pod_name". If it is failing because of
/etc/coredns/Corefile:10 - Error during parsing: Unknown directive proxy
then use forward instead of proxy in the YAML file that holds the CoreDNS config, because the CoreDNS 1.5.x version used by the image no longer supports the proxy keyword.
This was a linkerd issue for me. To diagnose any linkerd-related issues, you can use the linkerd CLI and run linkerd check. This should show you whether there is an issue with linkerd, along with links to instructions for fixing it.
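For the CoreDNS fix described in the steps above (replacing proxy with forward), a minimal sketch of the rename is shown here. The function name proxy_to_forward is hypothetical; it only renames the directive at the start of a line and does not translate options, so any proxy-specific options still need a manual review against the forward plugin.

```shell
# proxy_to_forward   (hypothetical helper, for illustration)
# Reads a Corefile on stdin and writes it to stdout with a leading "proxy"
# directive renamed to "forward". All other lines pass through unchanged.
proxy_to_forward() {
  sed -E 's/^([[:space:]]*)proxy([[:space:]])/\1forward\2/'
}

# Possible usage, assuming the Corefile has been saved locally as ./Corefile:
#   proxy_to_forward < Corefile > Corefile.new
# Where the Corefile lives depends on the cluster; in many setups it is a
# ConfigMap named coredns in kube-system, but that is an assumption here.
```

After updating the config, the CoreDNS pod has to pick up the new Corefile before the dependent pods can recover.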
For me, the issue was that the linkerd root certs had expired. In my case, linkerd was experimental in a dev cluster, so I removed it. However, if you need to update your certificates, you can follow the instructions at the following link.
https://linkerd.io/2.11/tasks/replacing_expired_certificates/
Thanks to https://stackoverflow.com/a/59644120/1212371 I was put on the right path.
In my case, it was related to linkerd/linkerd2#3497: the Linkerd service had some internal problems and couldn't respond to the API service requests. Fixed by restarting its pods.
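To see which aggregated API service is not responding before restarting anything, the table printed by kubectl get apiservices can be filtered. This is a sketch under the assumption that the output has the usual NAME, SERVICE, AVAILABLE, AGE columns; the function name unavailable_apiservices is hypothetical.

```shell
# unavailable_apiservices   (hypothetical helper, for illustration)
# Reads the table printed by `kubectl get apiservices` on stdin and prints
# the NAME of every row whose AVAILABLE column is not "True".
unavailable_apiservices() {
  # NR > 1 skips the header row; $3 is the AVAILABLE column.
  awk 'NR > 1 && $3 != "True" { print $1 }'
}

# Usage:
#   kubectl get apiservices | unavailable_apiservices
```

An entry such as v1alpha1.tap.linkerd.io in the output points at the Linkerd tap API service as the one the kube-apiserver cannot reach.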