Kubernetes - nginx-ingress crashes after file upload via PHP

Question

I'm running a Kubernetes cluster on Google Cloud Platform via their Kubernetes Engine. The cluster version is 1.13.11-gke.14. The PHP application pod contains two containers: Nginx as a reverse proxy and php-fpm (7.2).

On Google Cloud, a TCP load balancer sits in front, and traffic is then routed internally via Nginx Ingress.

The problem: when I upload a larger file (17 MB), the ingress crashes with this error:

W 2019-12-01T14:26:06.341588Z Dynamic reconfiguration failed: Post http+unix://nginx-status/configuration/backends: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory 
E 2019-12-01T14:26:06.341658Z Unexpected failure reconfiguring NGINX: 
W 2019-12-01T14:26:06.345575Z requeuing initial-sync, err Post http+unix://nginx-status/configuration/backends: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory 
I 2019-12-01T14:26:06.354869Z Configuration changes detected, backend reload required. 
E 2019-12-01T14:26:06.393528796Z Post http+unix://nginx-status/configuration/backends: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory

E 2019-12-01T14:26:08.077580Z healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused 
I 2019-12-01T14:26:12.314526990Z 10.132.0.25 - [10.132.0.25] - - [01/Dec/2019:14:26:12 +0000] "GET / HTTP/2.0" 200 541 "-" "GoogleStackdriverMonitoring-UptimeChecks(https://cloud.google.com/monitoring)" 99 1.787 [bap-staging-bap-staging-80] [] 10.102.2.4:80 553 1.788 200 5ac9d438e5ca31618386b35f67e2033b

E 2019-12-01T14:26:12.455236Z healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused 
I 2019-12-01T14:26:13.156963Z Exiting with 0

Here is the YAML configuration of the Nginx ingress. It is the default configuration from GitLab's system, which creates the cluster on its own.

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  creationTimestamp: "2019-11-24T17:35:04Z"
  generation: 3
  labels:
    app: nginx-ingress
    chart: nginx-ingress-1.22.1
    component: controller
    heritage: Tiller
    release: ingress
  name: ingress-nginx-ingress-controller
  namespace: gitlab-managed-apps
  resourceVersion: "2638973"
  selfLink: /apis/apps/v1/namespaces/gitlab-managed-apps/deployments/ingress-nginx-ingress-controller
  uid: bfb695c2-0ee0-11ea-a36a-42010a84009f
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx-ingress
      release: ingress
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        prometheus.io/port: "10254"
        prometheus.io/scrape: "true"
      creationTimestamp: null
      labels:
        app: nginx-ingress
        component: controller
        release: ingress
    spec:
      containers:
      - args:
        - /nginx-ingress-controller
        - --default-backend-service=gitlab-managed-apps/ingress-nginx-ingress-default-backend
        - --election-id=ingress-controller-leader
        - --ingress-class=nginx
        - --configmap=gitlab-managed-apps/ingress-nginx-ingress-controller
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        name: nginx-ingress-controller
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        - containerPort: 443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        resources: {}
        securityContext:
          allowPrivilegeEscalation: true
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
          runAsUser: 33
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/nginx/modsecurity/modsecurity.conf
          name: modsecurity-template-volume
          subPath: modsecurity.conf
        - mountPath: /var/log/modsec
          name: modsecurity-log-volume
      - args:
        - /bin/sh
        - -c
        - tail -f /var/log/modsec/audit.log
        image: busybox
        imagePullPolicy: Always
        name: modsecurity-log
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/log/modsec
          name: modsecurity-log-volume
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: ingress-nginx-ingress
      serviceAccountName: ingress-nginx-ingress
      terminationGracePeriodSeconds: 60
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: modsecurity.conf
            path: modsecurity.conf
          name: ingress-nginx-ingress-controller
        name: modsecurity-template-volume
      - emptyDir: {}
        name: modsecurity-log-volume

I have no idea what else to try. I'm running the cluster on 3 nodes (2x 1 vCPU with 1.5 GB RAM and 1x preemptible 2 vCPU with 1.8 GB RAM), all of them on SSD drives.

Any time I upload the image, disk I/O spikes heavily.

[Figures: Disk IOPS; Disk I/O]

Thanks for your help.

Answer 1

Found the solution. The Nginx-ingress pod also contained ModSecurity. All requests were analyzed by ModSecurity, and larger uploaded files caused those crashes. It wasn't a crash at all: the analysis took so much CPU and I/O that health check responses became slow for all other pods too. The solution is to configure ModSecurity correctly or disable it.
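
As a rough sketch of the "disable" option, assuming the controller reads the ConfigMap named in its `--configmap` flag (`gitlab-managed-apps/ingress-nginx-ingress-controller` in the Deployment above), the ingress-nginx `enable-modsecurity` key can be set to `"false"`. Whether GitLab's managed-apps tooling later overwrites this value is an assumption I have not verified.

```yaml
# Sketch only: merge this key into the existing ConfigMap (for example with
# `kubectl edit configmap`), do not replace the whole object. The Deployment
# above mounts the `modsecurity.conf` key from this same ConfigMap, so that
# key must stay in place.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-ingress-controller   # taken from the --configmap flag
  namespace: gitlab-managed-apps
data:
  # Option A: turn ModSecurity off in the controller.
  enable-modsecurity: "false"
  # Option B (alternative, untested assumption): keep ModSecurity on and,
  # inside the existing modsecurity.conf, stop inspecting request bodies:
  #   SecRequestBodyAccess Off
  # or tune SecRequestBodyLimit / SecRequestBodyLimitAction for uploads.
```

Option A removes the WAF entirely, so Option B may be preferable if request filtering is still wanted for everything except large uploads.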
