
kubectl exec/logs on GKE returns "remote error: tls: internal error"

I'm currently getting an error when trying to exec into Pods, or fetch their logs, on my GKE cluster.

$ kubectl logs <POD-NAME>
Error from server: Get "https://<NODE-PRIVATE-IP>:10250/containerLogs/default/<POD-NAME>/<DEPLOYMENT-NAME>": remote error: tls: internal error
$ kubectl exec -it <POD-NAME> -- sh
Error from server: error dialing backend: remote error: tls: internal error

One suspicious thing I found while troubleshooting is that all CSRs are being denied...

$ kubectl get csr
NAME        AGE     SIGNERNAME                      REQUESTOR                 CONDITION
csr-79zkn   4m16s   kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7b5sx   91m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7fzjh   103m    kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7gstl   19m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7hrvm   11m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7mn6h   87m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7nd7h   4m57s   kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
...
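
With a long CSR list it can help to tally the entries by signer and condition. This is a minimal sketch, assuming the column layout shown above (signer in the third column, condition in the last). `summarize_csrs` is a hypothetical helper name, not a kubectl feature.

```shell
# Hypothetical helper: tally CSRs by signer and condition.
# Reads `kubectl get csr --no-headers` output on stdin.
summarize_csrs() {
  # Skip lines with fewer than 5 columns (for example a trailing "...").
  awk 'NF >= 5 { count[$3 " " $NF]++ }
       END { for (k in count) print count[k], k }' | LC_ALL=C sort -k2
}

# Usage against a live cluster:
#   kubectl get csr --no-headers | summarize_csrs
```

Each output line has the form `<count> <signer> <condition>`, so a cluster in the state shown above would report every `kubernetes.io/kubelet-serving` request under `Denied`.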

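Any idea why this is happening? Maybe a firewall issue?

Thanks in advance!

Update 1

Here are the same commands with verbose output (--v=8), without the goroutine stack traces: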

$ kubectl logs --v=8 <POD-NAME>

I0527 09:27:59.624843   10407 loader.go:375] Config loaded from file:  /home/kevin/.kube/config
I0527 09:27:59.628621   10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:27:59.628635   10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.628644   10407 round_trippers.go:431]     Accept: application/json, */*
I0527 09:27:59.628649   10407 round_trippers.go:431]     User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.727411   10407 round_trippers.go:446] Response Status: 200 OK in 98 milliseconds
I0527 09:27:59.727461   10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.727480   10407 round_trippers.go:452]     Audit-Id: ...
I0527 09:27:59.727496   10407 round_trippers.go:452]     Cache-Control: no-cache, private
I0527 09:27:59.727512   10407 round_trippers.go:452]     Content-Type: application/json
I0527 09:27:59.727528   10407 round_trippers.go:452]     Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.727756   10407 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"<POD-BASE-NAME>","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"...","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"<NAME>","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"...\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:27:59.745985   10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/log
I0527 09:27:59.746035   10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.746055   10407 round_trippers.go:431]     Accept: application/json, */*
I0527 09:27:59.746071   10407 round_trippers.go:431]     User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.800586   10407 round_trippers.go:446] Response Status: 500 Internal Server Error in 54 milliseconds
I0527 09:27:59.800638   10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.800654   10407 round_trippers.go:452]     Audit-Id: ...
I0527 09:27:59.800668   10407 round_trippers.go:452]     Cache-Control: no-cache, private
I0527 09:27:59.800680   10407 round_trippers.go:452]     Content-Type: application/json
I0527 09:27:59.800693   10407 round_trippers.go:452]     Content-Length: 217
I0527 09:27:59.800712   10407 round_trippers.go:452]     Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.800772   10407 request.go:1097] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Get \"https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>\": remote error: tls: internal error","code":500}
I0527 09:27:59.801848   10407 helpers.go:216] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "Get \"https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>\": remote error: tls: internal error",
  "code": 500
}]
F0527 09:27:59.801944   10407 helpers.go:115] Error from server: Get "https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>": remote error: tls: internal error

$ kubectl exec --v=8 -it <POD-NAME> -- sh

I0527 09:44:48.673774   11157 loader.go:375] Config loaded from file:  /home/kevin/.kube/config
I0527 09:44:48.678514   11157 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:44:48.678528   11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.678535   11157 round_trippers.go:431]     Accept: application/json, */*
I0527 09:44:48.678543   11157 round_trippers.go:431]     User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.795864   11157 round_trippers.go:446] Response Status: 200 OK in 117 milliseconds
I0527 09:44:48.795920   11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.795963   11157 round_trippers.go:452]     Audit-Id: ...
I0527 09:44:48.795995   11157 round_trippers.go:452]     Cache-Control: no-cache, private
I0527 09:44:48.796019   11157 round_trippers.go:452]     Content-Type: application/json
I0527 09:44:48.796037   11157 round_trippers.go:452]     Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.796644   11157 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"...","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"...\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:44:48.814315   11157 round_trippers.go:420] POST https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/exec?command=sh&container=<SERVICE-NAME>&stdin=true&stdout=true&tty=true
I0527 09:44:48.814372   11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.814391   11157 round_trippers.go:431]     User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.814406   11157 round_trippers.go:431]     X-Stream-Protocol-Version: v4.channel.k8s.io
I0527 09:44:48.814420   11157 round_trippers.go:431]     X-Stream-Protocol-Version: v3.channel.k8s.io
I0527 09:44:48.814445   11157 round_trippers.go:431]     X-Stream-Protocol-Version: v2.channel.k8s.io
I0527 09:44:48.814471   11157 round_trippers.go:431]     X-Stream-Protocol-Version: channel.k8s.io
I0527 09:44:48.913928   11157 round_trippers.go:446] Response Status: 500 Internal Server Error in 99 milliseconds
I0527 09:44:48.913977   11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.914005   11157 round_trippers.go:452]     Audit-Id: ...
I0527 09:44:48.914029   11157 round_trippers.go:452]     Cache-Control: no-cache, private
I0527 09:44:48.914054   11157 round_trippers.go:452]     Content-Type: application/json
I0527 09:44:48.914077   11157 round_trippers.go:452]     Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.914099   11157 round_trippers.go:452]     Content-Length: 149
I0527 09:44:48.915741   11157 helpers.go:216] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "error dialing backend: remote error: tls: internal error",
  "code": 500
}]
F0527 09:44:48.915837   11157 helpers.go:115] Error from server: error dialing backend: remote error: tls: internal error

Update 2

After connecting to one of the GKE worker nodes and checking the kubelet logs, I found these weird lines:

May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.271022    1272 log.go:181] http: TLS handshake error from 10.156.0.9:54672: no serving certificate available for the kubelet
May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.305628    1272 log.go:181] http: TLS handshake error from 10.156.0.9:54674: no serving certificate available for the kubelet
May 27 09:30:12 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:12.067998    1272 log.go:181] http: TLS handshake error from 10.156.0.11:57610: no serving certificate available for the kubelet
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.144826    1272 certificate_manager.go:412] Rotating certificates
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.154322    1272 reflector.go:207] Starting reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.448976    1272 reflector.go:213] Stopping reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: E0527 09:30:14.449045    1272 certificate_manager.go:454] certificate request was not signed: cannot watch on the certificate signing request: certificate signing request is denied, reason: AutoDenied, message:
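
To see which clients are hitting the missing serving certificate, the handshake errors can be counted per source IP. This is a minimal sketch, assuming the log format shown above and a kubelet that runs as a systemd unit named `kubelet`. `handshake_errors_by_ip` is a hypothetical helper name.

```shell
# Hypothetical helper: count "no serving certificate" TLS handshake
# errors per client IP. Reads kubelet log lines on stdin.
handshake_errors_by_ip() {
  grep 'no serving certificate available for the kubelet' |
    # Keep only the client IP that precedes the ":<port>:" part.
    sed -n 's/.*TLS handshake error from \([0-9.]*\):[0-9]*:.*/\1/p' |
    LC_ALL=C sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage on a node (assumes a systemd unit named "kubelet"):
#   journalctl -u kubelet --no-pager | handshake_errors_by_ip
```

Each output line has the form `<client-ip> <count>`.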

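Update 3

I upgraded the cluster from 1.19.9-gke.1400 to 1.19.9-gke.1900. That didn't solve the problem...

I also performed a credentials rotation on the cluster, but that didn't solve it either...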

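Final

After trying many changes on the cluster:

  • Restarting the kubelet on the nodes
  • Restarting the nodes
  • Scaling the node pool up and down
  • Upgrading the cluster version
  • Rotating the cluster certificates

Even creating a new cluster (in the same project, with the same VPC, etc.) didn't solve the problem...

This issue might be related to changes made to the firewall rules.

The only solution I found was to create a new GKE cluster in a new GCP project and migrate the workloads using Velero.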

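In case this answer helps:

This issue is caused by pending certificate signing requests made for the nodes by the kubelet running on each node.

Check the pending CSRs for the nodes:

$ kubectl get csr --sort-by=.metadata.creationTimestamp

Then approve the CSR for each node:

$ kubectl certificate approve <csr-id>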
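
When many requests are waiting, the two steps above can be combined by first selecting only the kubelet serving CSRs that are still Pending. This is a minimal sketch, assuming the default `kubectl get csr` column layout (name first, signer third, condition last). `pending_serving_csrs` is a hypothetical helper name. CSRs that are already Denied, as in the question, are not matched by this filter.

```shell
# Hypothetical helper: print the names of Pending kubelet-serving CSRs.
# Reads `kubectl get csr --no-headers` output on stdin.
pending_serving_csrs() {
  awk '$3 == "kubernetes.io/kubelet-serving" && $NF == "Pending" { print $1 }'
}

# Usage against a live cluster. Review the names first, then approve:
#   kubectl get csr --no-headers | pending_serving_csrs
#   kubectl get csr --no-headers | pending_serving_csrs | xargs kubectl certificate approve
```

Listing the names before approving leaves room to check that each requestor really is one of your nodes, since approving a serving certificate lets its holder present itself as that kubelet.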

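For more details, see this section of the Kubernetes documentation:

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#kubelet-serving-certs

A known limitation is that the CSRs (Certificate Signing Requests) for these certificates cannot be automatically approved by the default signer in the kube-controller-manager, kubernetes.io/kubelet-serving. This requires action from the user or a third-party controller.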

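Please approve your certificates. This applies to both Kubernetes and OpenShift. On OpenShift, the steps to approve the certificates are:

1- Check the CSRs and see whether their condition is Pending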

$ oc get csr

2- Approve the certificates

$ oc get csr -o name | xargs oc adm certificate approve

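As @Kevin mentioned, the issue was resolved by creating a new cluster in a new GCP project and then migrating the workloads using Velero.

Also try upgrading the cluster to v1.20.8-gke.900, as the issue disappeared in that version.

To learn more about cluster types and versions, see this documentation.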
