GCE Load Balancer Health Check Fails (Connection Refused)
My (GCE) Load Balancer health checks are failing with a connection refused error, ultimately marking my GCE Ingress as UNHEALTHY. Now I'm wondering how to fix this issue.
For my setup I'm using a GKE Autopilot cluster. I have torn down and restarted my setup several times, always with the same result.
Suppose I have a deployment configured with a pod template consisting of several containers (not all of which expose ports).
Side note: for simplicity I skipped some configurations, such as config maps.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: some-app-deployment
spec:
  selector:
    matchLabels:
      app: some-app
  replicas: 2
  template:
    metadata:
      labels:
        app: some-app
    spec:
      restartPolicy: Always
      containers:
        - name: web-server
          image: {{some-app-image}}
          command: ['app', 'web-server']
          ports:
            - name: web-server
              containerPort: 5000
              protocol: TCP
        - name: admin-server
          image: {{some-app-image}}
          command: ['app', 'admin-server']
          ports:
            - name: admin-server
              containerPort: 5001
              protocol: TCP
        - name: worker
          image: {{some-app-image}}
          command: ['app', 'worker']
        - name: cron
          image: {{some-app-image}}
          command: ['app', 'cron']
        - name: helper
          image: {{some-app-image}}
          command: [ '/bin/bash', '-c', '--' ]
          args: [ 'while true; do sleep 30; done;' ]
The following is the BackendConfig CRD, which is supposed to define the health check. I chose /favicon.ico as the path because the Load Balancer health check requires an exact 200 OK response, and the web server's base path / emits a 302 redirect, so it would fail the health check. With kubectl port-forward I confirmed that /favicon.ico does return 200 OK. To rule the path out as the cause, I also tried other paths, without success.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: http-hc-config
spec:
  healthCheck:
    type: HTTP
    port: 5000
    requestPath: "/favicon.ico"
    checkIntervalSec: 20
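The manual check described above (port-forward, then request the path) can also be scripted. The following is only a sketch under assumptions: `StubApp` is a hypothetical stand-in for the real web server (302 on `/`, 200 on `/favicon.ico`, with a made-up `/login` redirect target), and `probe_status` and `passes_health_check` are illustrative helper names, not part of any GKE tooling. It relies on `http.client` not following redirects, so a 302 is reported as 302.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def probe_status(host: str, port: int, path: str, timeout: float = 5.0) -> int:
    """Send a single GET and return the raw status code (redirects are not followed)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def passes_health_check(status: int) -> bool:
    """Model the rule from the question: only an exact 200 counts as healthy."""
    return status == 200


class StubApp(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the web server: 302 on /, 200 on /favicon.ico."""

    def do_GET(self):
        if self.path == "/favicon.ico":
            self.send_response(200)
        else:
            self.send_response(302)
            self.send_header("Location", "/login")  # assumed redirect target
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def start_stub(host: str = "127.0.0.1") -> HTTPServer:
    """Start the stub on an ephemeral port in a background thread."""
    server = HTTPServer((host, 0), StubApp)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_stub()
    port = server.server_address[1]
    for path in ("/", "/favicon.ico"):
        status = probe_status("127.0.0.1", port, path)
        print(path, status, "healthy" if passes_health_check(status) else "unhealthy")
    server.shutdown()
```

Against the real pod, one would point `probe_status` at the local port opened by `kubectl port-forward` instead of the stub.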
Additionally, a custom header is added to the admin endpoint.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: x-header-config
spec:
  customRequestHeaders:
    headers:
      - "X-Client-Region:{client_region}"
The following is the service description. It references the BackendConfig via annotations as per the documentation. The documentation is not very specific about how to reference the health check, so I mapped it to the relevant port.
apiVersion: v1
kind: Service
metadata:
  name: some-app-service
  annotations:
    cloud.google.com/backend-config: '{
      "default":"http-hc-config",
      "ports":{"4001":"x-header-config"}}'
spec:
  type: NodePort
  selector:
    app: some-app
  ports:
    - name: web
      targetPort: 5000
      protocol: TCP
      port: 4000
    - name: admin
      targetPort: 5001
      protocol: TCP
      port: 4001
I confirmed that my service is running, as I could successfully kubectl port-forward the pods and access the web server's content.
Now the final piece of the setup is this Ingress object.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress
  annotations:
    kubernetes.io/ingress.class: gce
spec:
  rules:
    - host: admin.example.com
      http:
        paths:
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: some-app-service
                port:
                  number: 4001
    - http:
        paths:
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: some-app-service
                port:
                  number: 4000
EDIT #1 & #2: Running gcloud compute health-checks describe {{HEALTH_DEF_ID}}, I receive the following output for the health check on port 4001, which seems to be out of sync with the defined BackendConfig CRD:
checkIntervalSec: 15
creationTimestamp: '2022-02-07T11:30:50.059-08:00'
description: Default kubernetes L7 Loadbalancing health check for NEG.
healthyThreshold: 1
httpHealthCheck:
  portSpecification: USE_SERVING_PORT
  proxyHeader: NONE
  requestPath: /
id: {{REDACTED}}
kind: compute#healthCheck
logConfig:
  enable: true
name: k8s1-4622dadc-{{REDACTED}}
selfLink: https://www.googleapis.com/compute/v1/projects/{{REDACTED}}
timeoutSec: 15
type: HTTP
unhealthyThreshold: 2
And the following for port 4000, which surprisingly contains the right path and port configuration:
checkIntervalSec: 20
creationTimestamp: '2022-02-07T12:12:59.248-08:00'
description: Default kubernetes L7 Loadbalancing health check for NEG.
healthyThreshold: 1
httpHealthCheck:
  port: 5000
  portSpecification: USE_FIXED_PORT
  proxyHeader: NONE
  requestPath: /favicon.ico
id: {{REDACTED}}
kind: compute#healthCheck
name: k8s1-4622dadc-{{REDACTED}}
selfLink: https://www.googleapis.com/compute/v1/projects/{{REDACTED}}
timeoutSec: 15
type: HTTP
unhealthyThreshold: 2
Either there's something wrong with my setup, or the BackendConfig is not applied equally to all services in the rules section of the Ingress.
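To make the mismatch between the BackendConfig and the two generated health checks explicit, the relevant fields can be compared side by side. This is a rough sketch: `health_check_mismatches` is a hypothetical helper, the dictionaries are hand-copied from the YAML and `gcloud` output shown above, and the field mapping (`healthCheck.port` to `httpHealthCheck.port`, and so on) is inferred from that output rather than taken from any official schema.

```python
def health_check_mismatches(backend_config: dict, gcloud_check: dict) -> dict:
    """Return {field: (expected, actual)} for every field that differs."""
    expected = backend_config["spec"]["healthCheck"]
    actual_http = gcloud_check.get("httpHealthCheck", {})
    pairs = {
        "checkIntervalSec": (expected.get("checkIntervalSec"),
                             gcloud_check.get("checkIntervalSec")),
        "requestPath": (expected.get("requestPath"),
                        actual_http.get("requestPath")),
        "port": (expected.get("port"), actual_http.get("port")),
    }
    return {field: pair for field, pair in pairs.items() if pair[0] != pair[1]}


# Values copied from the http-hc-config BackendConfig in the question.
http_hc_config = {
    "spec": {
        "healthCheck": {
            "type": "HTTP",
            "port": 5000,
            "requestPath": "/favicon.ico",
            "checkIntervalSec": 20,
        }
    }
}

# Relevant fields of the health check reported for service port 4001.
check_4001 = {
    "checkIntervalSec": 15,
    "httpHealthCheck": {"portSpecification": "USE_SERVING_PORT", "requestPath": "/"},
}

# Relevant fields of the health check reported for service port 4000.
check_4000 = {
    "checkIntervalSec": 20,
    "httpHealthCheck": {
        "port": 5000,
        "portSpecification": "USE_FIXED_PORT",
        "requestPath": "/favicon.ico",
    },
}

if __name__ == "__main__":
    print("port 4000:", health_check_mismatches(http_hc_config, check_4000))
    print("port 4001:", health_check_mismatches(http_hc_config, check_4001))
```

The check for port 4000 matches the BackendConfig on all three fields, while the one for port 4001 still carries the defaults.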
EDIT #3: The health check log entry for port 4000 shows:
healthCheckProbeResult: {
  detailedHealthState: "TIMEOUT"
  healthCheckProtocol: "HTTP"
  healthState: "UNHEALTHY"
  ipAddress: "10.40.129.203"
  previousDetailedHealthState: "UNKNOWN"
  previousHealthState: "UNHEALTHY"
  probeCompletionTimestamp: "2022-02-07T20:06:10.955412154Z"
  probeRequest: "/favicon.ico"
  probeResultText: "HTTP response: , Error: Connection refused"
  probeSourceIp: "35.191.12.114"
  responseLatency: "0.000569s"
  targetIp: "10.40.129.203"
  targetPort: 5000
}
EDIT #4: I needed to adjust the BackendConfig annotation, as my case was in fact more involved than my previous definitions showed.
The problem was in the way I applied the BackendConfig annotation of the service. My assumption was that the "default" config would be applied to both ports and the extra header config would additionally be applied to port 4001.
But this is not the case.
If a port needs settings that deviate from the default, you have to define a complete BackendConfig for that port. BackendConfigs are not merged, and you cannot attach multiple configs to one port.
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: config-4000
spec:
  healthCheck:
    type: HTTP
    port: 5000
    requestPath: "/favicon.ico"
    checkIntervalSec: 20
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: config-4001
spec:
  healthCheck:
    type: HTTP
    port: 5000
    requestPath: "/favicon.ico"
    checkIntervalSec: 20
  customRequestHeaders:
    headers:
      - "X-Client-Region:{client_region}"
---
apiVersion: v1
kind: Service
metadata:
  name: some-app-service
  annotations:
    cloud.google.com/backend-config: '{"ports":{
      "4000":"config-4000",
      "4001":"config-4001"
      }}'
spec:
  type: NodePort
  selector:
    app: some-app
  ports:
    - name: web
      targetPort: 5000
      protocol: TCP
      port: 4000
    - name: admin
      targetPort: 5001
      protocol: TCP
      port: 4001
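The "no merging" behaviour described in EDIT #4 can be modelled in a few lines. This is a sketch of the rule as observed in this post, not GKE's actual implementation: `resolve_backend_config` is a hypothetical helper, and it only handles numeric port keys as used in the annotations above.

```python
import json
from typing import Optional


def resolve_backend_config(annotation: str, service_port: int) -> Optional[str]:
    """Return the single BackendConfig name that applies to a service port."""
    spec = json.loads(annotation)
    ports = spec.get("ports", {})
    # A per-port entry replaces the default entirely; the two are never merged.
    return ports.get(str(service_port), spec.get("default"))


# Annotation from the original Service: default plus an override for 4001.
original = '{"default": "http-hc-config", "ports": {"4001": "x-header-config"}}'

# Annotation from the fixed Service: one complete config per port.
fixed = '{"ports": {"4000": "config-4000", "4001": "config-4001"}}'

if __name__ == "__main__":
    for name, annotation in (("original", original), ("fixed", fixed)):
        for port in (4000, 4001):
            print(name, port, resolve_backend_config(annotation, port))
```

With the original annotation, port 4001 resolves to `x-header-config` alone, which contains no `healthCheck` section. That is consistent with the default health check (`requestPath: /`) reported for that port.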
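EDIT #1:
An additional issue (the one causing the connection refused error) was in my project config: the server was listening on 127.0.0.1 instead of 0.0.0.0. A server bound to 127.0.0.1 only accepts connections from inside the pod, which is why kubectl port-forward worked while the load balancer's probes to the pod IP were refused.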
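The difference between the two bind addresses can be reproduced locally with plain sockets. The sketch below rests on assumptions: `can_connect` is a hypothetical helper, and `127.0.0.2` is used only as a stand-in for "an address other than the one the server bound to" (it plays the role of the pod IP here). On a typical Linux host the whole 127.0.0.0/8 range is local, so the first probe should be refused and the second should succeed; other platforms may behave differently.

```python
import socket


def can_connect(bind_host: str, connect_host: str) -> bool:
    """Listen on bind_host (ephemeral port) and try a TCP connect via connect_host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((bind_host, 0))
        server.listen(1)
        port = server.getsockname()[1]
        try:
            # The kernel completes the handshake for a listening socket,
            # so no accept() call is needed for this check.
            with socket.create_connection((connect_host, port), timeout=1):
                return True
        except OSError:
            # Covers "connection refused" as well as timeouts.
            return False


if __name__ == "__main__":
    # Bound to loopback only: reachable via 127.0.0.1, not via other addresses.
    print("bind 127.0.0.1, probe 127.0.0.2:", can_connect("127.0.0.1", "127.0.0.2"))
    # Bound to all interfaces: reachable via any local address.
    print("bind 0.0.0.0,   probe 127.0.0.2:", can_connect("0.0.0.0", "127.0.0.2"))
```

In the cluster, the equivalent fix is to make the application listen on 0.0.0.0 so that it accepts connections arriving on the pod IP.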
Can you run kubectl describe ing ingress? I think you should see an error about an invalid wildcard. You should leave the * out.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress
  annotations:
    kubernetes.io/ingress.class: gce
spec:
  rules:
    - host: admin.example.com
      http:
        paths:
          - path: /
            pathType: ImplementationSpecific
            backend:
              service:
                name: some-app-service
                port:
                  number: 4001
    - http:
        paths:
          - path: /
            pathType: ImplementationSpecific
            backend:
              service:
                name: some-app-service
                port:
                  number: 4000