
How to control vhost_traffic_status_zone memory in the K8s nginx ingress?

Background

We run a Kubernetes cluster that handles several PHP/Lumen microservices. We started seeing the app's php-fpm/nginx reporting a 499 status code in its logs, and this seems to correspond with the client getting a blank response (curl returns curl: (52) Empty reply from server) while the application logs a 499.

10.10.x.x - - [09/Mar/2020:18:26:46 +0000] "POST /some/path/ HTTP/1.1" 499 0 "-" "curl/7.65.3"

My understanding is that nginx logs the 499 code when the client socket is no longer open/available to return the content to. In this situation that appears to mean something in front of the nginx/application layer is terminating the connection. Our configuration currently is:

ELB -> k8s nginx ingress -> application

So my suspects are either the ELB or the ingress, since the application is the one that has no socket left to return to. So I started digging through the ingress logs...

Potential core problem?

While looking through the ingress logs, I'm seeing quite a few of these:

2020/03/06 17:40:01 [crit] 11006#11006: ngx_slab_alloc() failed: no memory in vhost_traffic_status_zone "vhost_traffic_status"
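To get a feel for how often the zone runs out of memory, the matching log lines can be counted. The following is a minimal sketch; the function name count_slab_failures is made up for illustration, and it assumes the controller logs are piped in on stdin.

```shell
# Count slab allocation failures for one shared memory zone in log text read from stdin.
# Usage: count_slab_failures <zone-name> < logfile
# Example (placeholders in angle brackets must be filled in):
#   kubectl -n <namespace> logs <ingress-pod> | count_slab_failures vhost_traffic_status
count_slab_failures() {
    zone="$1"
    # -F matches the text literally (no regex), -c prints the number of matching lines.
    # grep exits non-zero when the count is 0, so mask that with "|| true".
    grep -F -c "ngx_slab_alloc() failed: no memory in vhost_traffic_status_zone \"$zone\"" || true
}
```

A count that keeps growing after a controller restart suggests the zone is genuinely too small rather than hitting a one-off spike.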

Potential Solution

I imagine that if I gave vhost_traffic_status_zone some more memory, at least that error would go away and I could move on to finding the next one, but I can't seem to find any configmap value or annotation that would allow me to control this. I've checked the docs:

https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/

https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/

Thanks in advance for any insight / suggestions / documentation I might be missing!

Here is the standard way to look up how to modify the nginx.conf in the ingress controller. After that, I'll link some information on how much memory you should give the zone.

First, get the ingress controller version by checking the image version on the deployment:

kubectl -n <namespace> get deployment <deployment-name> -o yaml | grep 'image:'
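If only the version tag is wanted, the image line can be trimmed down further. This is a rough sketch: image_tag is a hypothetical helper name, it expects the deployment YAML on stdin, and it does not handle images pinned by digest (@sha256:...) or quoted image values.

```shell
# Print the tag of each container image found in YAML text read from stdin.
# Example (placeholders in angle brackets must be filled in):
#   kubectl -n <namespace> get deployment <deployment-name> -o yaml | image_tag
image_tag() {
    # Keep only "image:" lines and strip everything up to and including "image:".
    sed -n 's/^.*image:[[:space:]]*//p' |
        # Split on ":" and print the last field; lines without a tag print nothing.
        awk -F: 'NF > 1 { print $NF }'
}
```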

From there, you can retrieve the code for your version from the following URL. In the following, I will be using version 0.10.2:

https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.10.2

The nginx.conf template can be found at rootfs/etc/nginx/template/nginx.tmpl in the code or /etc/nginx/template/nginx.tmpl on a pod. This can be grepped for the line of interest. In the example case, we find the following line in nginx.tmpl:

vhost_traffic_status_zone shared:vhost_traffic_status:{{ $cfg.VtsStatusZoneSize }};

This gives us the config variable to look up in the code. Our next grep, for VtsStatusZoneSize, leads us to these lines in internal/ingress/controller/config/config.go:

    // Description: Sets parameters for a shared memory zone that will keep states for various keys. The cache is shared between all worker processes.
    // https://github.com/vozlt/nginx-module-vts#vhost_traffic_status_zone
    // Default value is 10m
    VtsStatusZoneSize string `json:"vts-status-zone-size,omitempty"`
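The lookup from template variable to configmap key can be scripted for any other setting as well. The sketch below assumes the field is declared on a single line with a json struct tag, as in the excerpt above; configmap_key is a made-up helper name.

```shell
# Print the configmap key (the json struct tag) for a Go struct field,
# reading Go source from stdin.
# Usage: configmap_key <FieldName> < internal/ingress/controller/config/config.go
configmap_key() {
    field="$1"
    # Match lines like:  FieldName string `json:"some-key,omitempty"`
    # and print only the part of the tag before the first comma or closing quote.
    sed -n "s/^[[:space:]]*${field}[[:space:]].*json:\"\([^\",]*\).*/\1/p"
}
```

Run against the line above with the argument VtsStatusZoneSize, this prints vts-status-zone-size.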

This gives us the key "vts-status-zone-size" to be added to the controller's configmap ("ingress-nginx-ingress-controller" in this example; the name depends on how the controller was installed). The current value can be found in the rendered nginx.conf on a pod at /etc/nginx/nginx.conf.
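One way to set the key without editing the configmap by hand is a JSON merge patch. This is only a sketch: vts_zone_patch is a hypothetical helper, 20m is an arbitrary example value, and the namespace, configmap and pod names are placeholders for your own installation.

```shell
# Build a JSON merge patch that sets vts-status-zone-size in a configmap's data.
# Usage: vts_zone_patch <size>   (for example: vts_zone_patch 20m)
vts_zone_patch() {
    printf '{"data":{"vts-status-zone-size":"%s"}}\n' "$1"
}

# Applying the patch (fill in the placeholders in angle brackets):
#   kubectl -n <namespace> patch configmap <configmap-name> --type merge -p "$(vts_zone_patch 20m)"
# Checking the rendered value on a controller pod afterwards:
#   kubectl -n <namespace> exec <pod-name> -- grep vhost_traffic_status_zone /etc/nginx/nginx.conf
```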

When it comes to what size to set the zone to, the module docs suggest setting it to more than 2*usedSize:

If the message("ngx_slab_alloc() failed: no memory in vhost_traffic_status_zone") printed in error_log, increase to more than (usedSize * 2).

https://github.com/vozlt/nginx-module-vts#vhost_traffic_status_zone

"usedSize" can be found by hitting the stats page for nginx or through the JSON endpoint. Here is the request to get the JSON version of the stats and if you have jq the path to the value: curl http://localhost:18080/nginx_status/format/json 2> /dev/null | jq .sharedZones.usedSize curl http://localhost:18080/nginx_status/format/json 2> /dev/null | jq .sharedZones.usedSize

Hope this helps.
