
Google Cloud: Health Check is not removing failed instance from UDP Internal Load Balancer

I'm working on a project to move our SIP infrastructure to GCP.

I'm using a UDP internal load balancer with a private IP to route calls from Asterisk to my Kamailio SBCs. Asterisk is configured with the load balancer's IP address as its single outgoing endpoint.


My internal UDP load balancer has a frontend on port 5060, a backend with 2 SBCs, and a basic HTTP health check on port 80.

On each Kamailio SBC my application listens on port 5060 and an Apache server runs on port 80 for the health check, so stopping httpd changes the instance's status to unhealthy.
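
To double-check what the load balancer itself sees, I can query the backend health directly (the backend-service name is the one from the outputs below):

# gcloud compute backend-services get-health My-gateway-internal-lb-bservices --region=europe-west3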

forwarding-rules

# gcloud compute forwarding-rules describe ip-gateway-internal-lb-local-fontend --region=europe-west3
IPAddress: 10.156.0.15
IPProtocol: UDP
backendService: https://www.googleapis.com/compute/v1/projects/My-Project/regions/europe-west3/backendServices/My-gateway-internal-lb-bservices
creationTimestamp: '2018-01-30T10:20:19.564-08:00'
description: ''
id: 'XXXXXXXXXXXXX'
kind: compute#forwardingRule
loadBalancingScheme: INTERNAL
name: ip-gateway-internal-lb-local-fontend
network: https://www.googleapis.com/compute/v1/projects/My-Project/global/networks/default
ports:
- '5060'
region: https://www.googleapis.com/compute/v1/projects/My-Project/regions/europe-west3
selfLink: https://www.googleapis.com/compute/v1/projects/My-Project/regions/europe-west3/forwardingRules/ip-gateway-internal-lb-local-fontend
subnetwork: https://www.googleapis.com/compute/v1/projects/My-Project/regions/europe-west3/subnetworks/default

backend-service

# gcloud compute backend-services describe My-gateway-internal-lb-bservices --region=europe-west3
backends:
- balancingMode: CONNECTION
  description: ''
  group: https://www.googleapis.com/compute/v1/projects/My-Project/zones/europe-west3-a/instanceGroups/My-gateway-1xx
connectionDraining:
  drainingTimeoutSec: 0
creationTimestamp: '2018-01-30T10:15:10.688-08:00'
description: ''
fingerprint: XXXXXXXXX
healthChecks:
- https://www.googleapis.com/compute/v1/projects/My-Project/global/healthChecks/basic-check-internal-http
id: 'XXXXXXXXX'
kind: compute#backendService
loadBalancingScheme: INTERNAL
name: My-gateway-internal-lb-bservices
protocol: UDP
region: https://www.googleapis.com/compute/v1/projects/My-Project/regions/europe-west3
selfLink: https://www.googleapis.com/compute/v1/projects/My-Project/regions/europe-west3/backendServices/My-gateway-internal-lb-bservices
sessionAffinity: NONE
timeoutSec: 3

health-check

# gcloud compute health-checks describe basic-check-internal-http
checkIntervalSec: 3
creationTimestamp: '2018-01-31T01:13:25.030-08:00'
description: ''
healthyThreshold: 2
httpHealthCheck:
  host: ''
  port: 80
  proxyHeader: NONE
  requestPath: /
id: 'XXXXXXXXXXXXXXXXXXXX'
kind: compute#healthCheck
name: basic-check-internal-http
selfLink: https://www.googleapis.com/compute/v1/projects/My-Project/global/healthChecks/basic-check-internal-http
timeoutSec: 3
type: HTTP
unhealthyThreshold: 2
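
For reference, here is a minimal sketch of the gcloud commands that would recreate this setup, reconstructed from the outputs above (names, zone, thresholds and flags are taken from those outputs; the exact flag set may differ from what was originally run):

# gcloud compute health-checks create http basic-check-internal-http \
    --port=80 --request-path=/ --check-interval=3s --timeout=3s \
    --healthy-threshold=2 --unhealthy-threshold=2
# gcloud compute backend-services create My-gateway-internal-lb-bservices \
    --load-balancing-scheme=INTERNAL --protocol=UDP --timeout=3s \
    --health-checks=basic-check-internal-http --region=europe-west3
# gcloud compute backend-services add-backend My-gateway-internal-lb-bservices \
    --instance-group=My-gateway-1xx --instance-group-zone=europe-west3-a \
    --region=europe-west3
# gcloud compute forwarding-rules create ip-gateway-internal-lb-local-fontend \
    --load-balancing-scheme=INTERNAL --ip-protocol=UDP --ports=5060 \
    --address=10.156.0.15 --subnet=default --region=europe-west3 \
    --backend-service=My-gateway-internal-lb-bservices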

All timeouts are set to 3 s. The routing entries the internal UDP LB creates through session affinity (the persistence) are not removed immediately when an instance fails the health check; it takes about 15 minutes (without any traffic) for them to be removed.

The same happens when an instance becomes healthy again: it takes about 15 minutes before the LB takes it into account and it starts receiving traffic.

I didn't have this problem when I was using a UDP load balancer with an external IP address, because the Asterisk addresses sending the traffic are NATed, so the 5-tuple hash is different for each call.

But with a UDP LB using an internal IP, the 5-tuple hash is always the same (same src/dst IP:port), so how can I configure the timeout of the session affinity (persistence) entries, or force the LB to flush them?
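
The only related knob I can find is the session affinity setting on the backend service; the gcloud flag below is what I mean, but as far as I can see it only selects a hashing mode (NONE, CLIENT_IP, CLIENT_IP_PROTO, CLIENT_IP_PORT_PROTO for internal load balancing) and doesn't expose any timeout:

# gcloud compute backend-services update My-gateway-internal-lb-bservices \
    --region=europe-west3 --session-affinity=CLIENT_IP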

Maybe it's a bug! Has anyone run into the same problem? Thanks, and I'm looking forward to any help with this issue.

BR, Ouss
