Connection Refused between Kubernetes pods in the same cluster
I am new to Kubernetes and I'm working on deploying an application within a new Kubernetes cluster.
Currently, the service running has multiple pods that need to communicate with each other. I'm looking for a general approach to debugging the issue, rather than getting into the specifics of the service, as the question would become much too specific.
The pods within the cluster are throwing an error: err="Get \"http://testpod.mynamespace.svc.cluster.local:8080/\": dial tcp 10.10.80.100:8080: connect: connection refused"
Both pods are in the same cluster.
What are the best steps to take to debug this?
I have tried running: kubectl exec -it testpod --namespace mynamespace -- cat /etc/resolv.conf
And this returns: search mynamespace.svc.cluster.local svc.cluster.local cluster.local us-east-2.compute.internal
Which I found here: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
First of all, the following pattern:
my-svc.my-namespace.svc.cluster-domain.example
is applicable only to FQDNs of Services, not Pods, which have the following form:
pod-ip-address.my-namespace.pod.cluster-domain.example
e.g.:
172-17-0-3.default.pod.cluster.local
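As a rough illustration of how the Pod form differs from the Service form, the Pod name can be derived from the Pod IP by replacing the dots with dashes. This is only a sketch: the helper name pod_fqdn is hypothetical, and the default cluster domain cluster.local is an assumption that may not match your cluster.

```shell
# Hypothetical helper: build the DNS name of a Pod from its IP and namespace.
# Assumes the cluster domain is cluster.local unless a third argument is given.
pod_fqdn() {
  ip=$1
  namespace=$2
  domain=${3:-cluster.local}
  # Pod DNS names use the Pod IP with the dots replaced by dashes.
  printf '%s.%s.pod.%s\n' "$(printf '%s' "$ip" | tr '.' '-')" "$namespace" "$domain"
}
```

Calling pod_fqdn 172.17.0.3 default prints 172-17-0-3.default.pod.cluster.local, which matches the example above.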
So in fact you're querying the cluster DNS for the FQDN of a Service named testpod, not for the FQDN of the Pod. Judging by the fact that it is resolved successfully, such a Service already exists in your cluster but is most probably misconfigured. The fact that you're getting the error message connection refused can mean the following:
- your Service FQDN testpod.mynamespace.svc.cluster.local has been successfully resolved (otherwise you would receive something like curl: (6) Could not resolve host: testpod.default.svc.cluster.local)
- you've successfully reached your testpod Service (otherwise, i.e. if it existed but wasn't listening on port 8080, the port you're trying to connect to, you would receive a timeout, e.g. curl: (7) Failed to connect to testpod.default.svc.cluster.local port 8080: Connection timed out)
- you've reached the Pod exposed by the testpod Service (you've been successfully redirected to it by the testpod Service)
- but once you've reached the Pod, you're trying to connect to an incorrect port, and that's why the connection is being refused by the server
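The cases in this list can be told apart by the text of the error message. The following is a minimal sketch that maps a message to the failure stage it points at. The function name classify_curl_error and the labels it prints are hypothetical, and the matching relies only on the message fragments quoted above.

```shell
# Hypothetical helper: map an error message to the stage at which the request failed.
classify_curl_error() {
  case "$1" in
    *"Could not resolve host"*) echo "dns-failure" ;;  # the name did not resolve
    *"timed out"*)              echo "timeout" ;;      # nothing answered on that IP and port
    *"onnection refused"*)      echo "refused" ;;      # the target was reached but rejected the connection
    *)                          echo "unknown" ;;
  esac
}
```

For the error in the question, classify_curl_error "dial tcp 10.10.80.100:8080: connect: connection refused" prints refused, which corresponds to the last item of the list.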
My best guess is that your Pod in fact listens on a different port, like 80, but you exposed it via a ClusterIP Service by specifying only the --port value, e.g. by:
kubectl expose pod testpod --port=8080
In such a case both --port (the port of the Service) and --target-port (the port of the Pod) will have the same value. In other words, you've created a Service like the one below:
apiVersion: v1
kind: Service
metadata:
  name: testpod
spec:
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
And you probably should have exposed it either this way:
kubectl expose pod testpod --port=8080 --target-port=80
or with the following YAML manifest:
apiVersion: v1
kind: Service
metadata:
  name: testpod
spec:
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 80
Of course your targetPort may be different from 80, but connection refused in such a case almost certainly means one thing: the target HTTP server (running in the Pod) refuses connections on port 8080, most probably because it isn't listening on it. You didn't specify what image you're using, whether it's a standard nginx webserver or something based on your custom image. But if it's nginx and wasn't configured differently, it listens on port 80.
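Before going into the Pod, it may be worth confirming what the existing Service actually maps to. A minimal sketch, assuming the Service is named testpod in the mynamespace namespace as in the question; the function names are hypothetical wrappers around standard kubectl get calls.

```shell
# Hypothetical helper: print each "port -> targetPort" pair of a Service.
show_service_ports() {
  kubectl get service "$1" --namespace "$2" \
    -o jsonpath='{range .spec.ports[*]}{.port}{" -> "}{.targetPort}{"\n"}{end}'
}

# Hypothetical helper: list the Pod IPs and ports the Service currently routes to.
show_service_endpoints() {
  kubectl get endpoints "$1" --namespace "$2"
}
```

Running show_service_ports testpod mynamespace and show_service_endpoints testpod mynamespace shows whether the targetPort matches the port your container really listens on.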
For further debugging, you can open a shell in your Pod:
kubectl exec -it testpod --namespace mynamespace -- /bin/sh
and, if the netstat command is not present (the most likely scenario), run the following (this works on Debian-based images):
apt update && apt install net-tools
and then use netstat -ntlp to check which port your container listens on.
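If the image has no package manager, or the Pod cannot reach a package mirror, one possible fallback is to read the kernel's socket table directly. The sketch below assumes a Linux container. The helper name listening_ports is hypothetical. It parses a file in /proc/net/tcp format, keeps the rows in state 0A (LISTEN) and converts the hexadecimal port to decimal.

```shell
# Hypothetical helper: print the TCP ports in LISTEN state from a /proc/net/tcp-style file.
# Reads /proc/net/tcp by default; pass "-" to read the table from standard input.
listening_ports() {
  # Column 2 is local_address as HEXIP:HEXPORT, column 4 is the socket state (0A = LISTEN).
  awk 'NR > 1 && $4 == "0A" { n = split($2, a, ":"); print a[n] }' "${1:-/proc/net/tcp}" |
    while read -r hex; do
      printf '%d\n' "0x$hex"   # convert the hexadecimal port to decimal
    done | sort -un
}
```

From outside the Pod, kubectl exec testpod --namespace mynamespace -- cat /proc/net/tcp | listening_ports - lists the IPv4 listening ports, provided the image ships cat. Servers bound to an IPv6 address appear in /proc/net/tcp6 instead.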
I hope this helps you solve your issue. In case of any doubts, don't hesitate to ask.