kubectl exec/logs on GKE returns "remote error: tls: internal error"
I'm currently getting an error whenever I try to exec into or fetch logs from Pods on my GKE cluster.
$ kubectl logs <POD-NAME>
Error from server: Get "https://<NODE-PRIVATE-IP>:10250/containerLogs/default/<POD-NAME>/<DEPLOYMENT-NAME>": remote error: tls: internal error
$ kubectl exec -it <POD-NAME> -- sh
Error from server: error dialing backend: remote error: tls: internal error
One suspicious thing I found while troubleshooting is that all CSRs are being denied...
$ kubectl get csr
NAME        AGE     SIGNERNAME                      REQUESTOR                 CONDITION
csr-79zkn   4m16s   kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7b5sx   91m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7fzjh   103m    kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7gstl   19m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7hrvm   11m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7mn6h   87m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7nd7h   4m57s   kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
...
Any idea why this is happening? Maybe a firewall issue?
Thanks in advance!
Here are the same commands with verbose output (--v=8), goroutine stack traces omitted:
$ kubectl logs --v=8 <POD-NAME>
I0527 09:27:59.624843 10407 loader.go:375] Config loaded from file: /home/kevin/.kube/config
I0527 09:27:59.628621 10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:27:59.628635 10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.628644 10407 round_trippers.go:431] Accept: application/json, */*
I0527 09:27:59.628649 10407 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.727411 10407 round_trippers.go:446] Response Status: 200 OK in 98 milliseconds
I0527 09:27:59.727461 10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.727480 10407 round_trippers.go:452] Audit-Id: ...
I0527 09:27:59.727496 10407 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:27:59.727512 10407 round_trippers.go:452] Content-Type: application/json
I0527 09:27:59.727528 10407 round_trippers.go:452] Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.727756 10407 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"<POD-BASE-NAME>","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"...","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"<NAME>","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"...\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:27:59.745985 10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/log
I0527 09:27:59.746035 10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.746055 10407 round_trippers.go:431] Accept: application/json, */*
I0527 09:27:59.746071 10407 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.800586 10407 round_trippers.go:446] Response Status: 500 Internal Server Error in 54 milliseconds
I0527 09:27:59.800638 10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.800654 10407 round_trippers.go:452] Audit-Id: ...
I0527 09:27:59.800668 10407 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:27:59.800680 10407 round_trippers.go:452] Content-Type: application/json
I0527 09:27:59.800693 10407 round_trippers.go:452] Content-Length: 217
I0527 09:27:59.800712 10407 round_trippers.go:452] Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.800772 10407 request.go:1097] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Get \"https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>\": remote error: tls: internal error","code":500}
I0527 09:27:59.801848 10407 helpers.go:216] server response object: [{
"metadata": {},
"status": "Failure",
"message": "Get \"https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>\": remote error: tls: internal error",
"code": 500
}]
F0527 09:27:59.801944 10407 helpers.go:115] Error from server: Get "https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>": remote error: tls: internal error
$ kubectl exec --v=8 -it <POD-NAME> -- sh
I0527 09:44:48.673774 11157 loader.go:375] Config loaded from file: /home/kevin/.kube/config
I0527 09:44:48.678514 11157 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:44:48.678528 11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.678535 11157 round_trippers.go:431] Accept: application/json, */*
I0527 09:44:48.678543 11157 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.795864 11157 round_trippers.go:446] Response Status: 200 OK in 117 milliseconds
I0527 09:44:48.795920 11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.795963 11157 round_trippers.go:452] Audit-Id: ...
I0527 09:44:48.795995 11157 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:44:48.796019 11157 round_trippers.go:452] Content-Type: application/json
I0527 09:44:48.796037 11157 round_trippers.go:452] Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.796644 11157 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"...","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"...\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:44:48.814315 11157 round_trippers.go:420] POST https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/exec?command=sh&container=<SERVICE-NAME>&stdin=true&stdout=true&tty=true
I0527 09:44:48.814372 11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.814391 11157 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.814406 11157 round_trippers.go:431] X-Stream-Protocol-Version: v4.channel.k8s.io
I0527 09:44:48.814420 11157 round_trippers.go:431] X-Stream-Protocol-Version: v3.channel.k8s.io
I0527 09:44:48.814445 11157 round_trippers.go:431] X-Stream-Protocol-Version: v2.channel.k8s.io
I0527 09:44:48.814471 11157 round_trippers.go:431] X-Stream-Protocol-Version: channel.k8s.io
I0527 09:44:48.913928 11157 round_trippers.go:446] Response Status: 500 Internal Server Error in 99 milliseconds
I0527 09:44:48.913977 11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.914005 11157 round_trippers.go:452] Audit-Id: ...
I0527 09:44:48.914029 11157 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:44:48.914054 11157 round_trippers.go:452] Content-Type: application/json
I0527 09:44:48.914077 11157 round_trippers.go:452] Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.914099 11157 round_trippers.go:452] Content-Length: 149
I0527 09:44:48.915741 11157 helpers.go:216] server response object: [{
"metadata": {},
"status": "Failure",
"message": "error dialing backend: remote error: tls: internal error",
"code": 500
}]
F0527 09:44:48.915837 11157 helpers.go:115] Error from server: error dialing backend: remote error: tls: internal error
After connecting to one of the GKE worker nodes and checking the kubelet logs, I found these weird lines:
May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.271022 1272 log.go:181] http: TLS handshake error from 10.156.0.9:54672: no serving certificate available for the kubelet
May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.305628 1272 log.go:181] http: TLS handshake error from 10.156.0.9:54674: no serving certificate available for the kubelet
May 27 09:30:12 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:12.067998 1272 log.go:181] http: TLS handshake error from 10.156.0.11:57610: no serving certificate available for the kubelet
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.144826 1272 certificate_manager.go:412] Rotating certificates
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.154322 1272 reflector.go:207] Starting reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.448976 1272 reflector.go:213] Stopping reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: E0527 09:30:14.449045 1272 certificate_manager.go:454] certificate request was not signed: cannot watch on the certificate signing request: certificate signing request is denied, reason: AutoDenied, message:
I updated the cluster version from 1.19.9-gke.1400 to 1.19.9-gke.1900. That didn't solve the problem...
I also performed a credentials rotation on the cluster. That didn't solve it either...
After trying lots of changes in the cluster:
Even creating a new cluster (in the same project, using the same VPC, etc.) didn't solve the problem...
The issue may be related to changes that were made to the firewall rules.
The only solution found was to create a new GKE cluster in a new GCP project and migrate the workloads using Velero.
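For reference, the Velero migration mentioned above boils down to a backup in the old cluster and a restore in the new one. A minimal sketch, assuming Velero is installed in both clusters and both point at the same object-storage backup location; the backup name and the `default` namespace are placeholders:

```shell
# Hypothetical backup name; adjust the namespace list to your workloads.
BACKUP_NAME="gke-migration-$(date +%Y%m%d)"

# In the old cluster (requires the velero CLI and server components):
# velero backup create "$BACKUP_NAME" --include-namespaces default
# velero backup describe "$BACKUP_NAME"   # wait until the phase is Completed

# In the new cluster, with the same backup storage location configured:
# velero restore create --from-backup "$BACKUP_NAME"
```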
Assuming this answer may help:
*This issue is caused by pending certificate signing requests issued by the kubelet running on each node.*
Check the pending CSRs of the nodes:
kubectl get csr --sort-by=.metadata.creationTimestamp
Then approve the CSR for each node:
kubectl certificate approve <csr-id>
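If many nodes are affected, approving each CSR by hand gets tedious. A minimal sketch of batch-approving only the pending kubelet-serving CSRs; the filter is a plain awk expression over the `kubectl get csr` table (column 3 is SIGNERNAME, the last column is CONDITION), and the live kubectl call is shown commented out since it assumes a working cluster context:

```shell
# Print the names of pending kubelet-serving CSRs from a `kubectl get csr` table.
pending_kubelet_csrs() {
  awk '$3 ~ /kubelet-serving/ && $NF == "Pending" { print $1 }'
}

# Against a live cluster (requires a configured kubectl context):
# kubectl get csr | pending_kubelet_csrs | xargs -r kubectl certificate approve
```

Review the list before piping it into `kubectl certificate approve`; blindly approving serving certificates for nodes you don't recognize defeats the purpose of the signer.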
For more details, see this section of the Kubernetes docs:
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#kubelet-serving-certs
A known limitation is that CSRs for these certificates cannot be automatically approved by the default signer in kube-controller-manager (kubernetes.io/kubelet-serving). They require action by the user or a third-party controller.
So please approve your certificates. This applies to Kubernetes/OpenShift alike; the OpenShift way of approving the certificates:
1- Check the CSRs and verify that their condition is Pending:
$ oc get csr
2- Approve the certificates:
$ oc get csr -o name | xargs oc adm certificate approve