502 / 503 / 404 HTTP error : GKE ingress-nginx serving traffic to the wrong services, from other namespaces

Question

I have this kind of routing in each namespace:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    janitor/expires: ${EXPIRY_DATE}
    nginx.ingress.kubernetes.io/ssl-redirect: "false" # Set to true once SSL is set up.
spec:
  ingressClassName: nginx
  rules:
    - host: api.${KUBE_DEPLOY_HOST}
      http:
        paths:
        - pathType: Prefix
          path: /
          backend:
            service:
              name: api-js
              port:
                number: 111

Served by ingress-nginx (not nginx-ingress) 1.2.1 (same issue with 1.5.1) on Kubernetes 1.22 (or 1.23), with one deployment in the ingress-nginx namespace and two replicas in that deployment.

When I check my logs, I see that I sometimes get 502 / 503 / 404 HTTP error responses from the ingress-nginx controller. I think it happens especially when I deploy new ingress rules in new namespaces (during and after the ingress-nginx reload event).

When I look into the detailed log, I see:

IP - - [time] "GET API_ROUTE HTTP/1.1" 503 592 "master.frontend.url" UA 449 0.000 [development-branch-api] [] - - - - ID

This makes me think the request goes wrong because the ingress-nginx controller serves the master frontend a development API response, sometimes when the new API service is not even ready.
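To make the reading of that log line explicit, here is a minimal Python sketch that pulls out the status, the Referer and the bracketed upstream name. It assumes the controller uses the default ingress-nginx access log layout, where the first bracketed field after the request time is the upstream the request was proxied to. The pattern and the function name `parse_access_log` are illustrative, not part of ingress-nginx.

```python
import re

# Assumption: default ingress-nginx access log layout. The user agent is
# normally quoted; the anonymized sample above has a bare token instead,
# so both forms are accepted.
LOG_PATTERN = re.compile(
    r'^(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time>[^\]]*)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes>\d+) '
    r'"(?P<referer>[^"]*)" (?P<user_agent>"[^"]*"|\S+) '
    r'(?P<request_length>\d+) (?P<request_time>[\d.]+) '
    r'\[(?P<upstream_name>[^\]]*)\] \[(?P<alt_upstream_name>[^\]]*)\]'
)


def parse_access_log(line):
    """Return the named log fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


sample = (
    'IP - - [time] "GET API_ROUTE HTTP/1.1" 503 592 "master.frontend.url" '
    'UA 449 0.000 [development-branch-api] [] - - - - ID'
)
entry = parse_access_log(sample)
print(entry["status"], entry["referer"], entry["upstream_name"])
```

The Referer tells which frontend issued the request, and the upstream name tells which backend the controller selected for the requested host.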

When I check the ingress from GKE's view, it looks like it is serving 3 pods, corresponding to 3 namespaces that should not overlap or mix requests, instead of the one API pod in the namespace corresponding to the ingress.

So the error is visible there: the ingresses of all 3 namespaces each serve 3 pods instead of one pod, which means it is all mixed up, right?

I am sure there is one pod per deployment in my namespaces.

So if I understand correctly, it seems that the situation is ingress A, ingress B and ingress C, all three of them, serve api A AND api B AND api C instead of serving just the one api pod from their namespace (A, B, C).

But what I don't know is how it is possible that the ingress matches pods from other namespaces when I am not using ExternalName; it is the opposite of what an ingress does by default.

I believe the issue is at the ingress level and not at the service level because, when I look into each service, I see that it serves just the one pod corresponding to its namespace and not 3.

The controller is the default ingress-nginx installation edited to use 2 replicas instead of one.

Example service and deployment (issue happens for all of them):

apiVersion: v1
kind: Service
metadata:
  name: api-js
  labels:
    component: api-js
    role: api-js
  annotations:
    janitor/expires: ${EXPIRY_DATE}
spec:
  type: ClusterIP
  selector:
    role: perfmaker-api-js
  ports:
    - name: httpapi
      port: 111
      targetPort: 111
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-js
  annotations:
    janitor/expires: ${EXPIRY_DATE}
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: api-js
  template:
    metadata:
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
      labels:
        app: api-js
        role: api-js
    spec:
      containers:
        - name: api-js
          image: registry/api

When I change the api name / selectors on one branch, it "untangles" the situation and each branch / namespace's ingress only serves the pod it should serve.

But the errors do not happen all the time; they happen during and after the 'reload' event on the ingress controller, which is fired when ingress resources are added / removed / updated. In my case that is when a new branch in the CI/CD creates a new namespace and deployment + ingress, or when a finished pipeline triggers a namespace deletion.

Answer 1

Alas, I must admit I just discovered that the error does not originate from the Kubernetes / ingress-nginx part of the setup but from the testing system, which has a collision between services at deploy time because of bad separation in the CI/CD job. Sorry for taking your time!

So in fact the log line from ingress-nginx that stunned me:

IP - - [time] "GET API_ROUTE HTTP/1.1" 503 592 "master.frontend.url" UA 449 0.000 [development-branch-api] [] - - - - ID

shows that a service I deploy is overwritten by another environment's deployment with different variables, which makes it start sending requests to another namespace. The ingress routing is correct.
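One way to spot this kind of cross-environment call from the logs is to compare the frontend host in the Referer with the namespace at the start of the upstream name. The Python sketch below is only an illustration: the `EXPECTED_NAMESPACE` mapping, the namespace name `master` and the function `is_cross_environment` are hypothetical, and it assumes upstream names start with the namespace followed by a dash.

```python
# Hypothetical mapping: the namespace each frontend host is expected to call.
EXPECTED_NAMESPACE = {"master.frontend.url": "master"}


def is_cross_environment(referer, upstream_name, expected=EXPECTED_NAMESPACE):
    """Return True when a known frontend reached an upstream outside its namespace."""
    # Reduce the Referer to its host part, with or without a scheme and path.
    host = referer.split("://", 1)[-1].split("/", 1)[0]
    namespace = expected.get(host)
    if namespace is None:
        return False  # unknown frontend: nothing to compare against
    # Assumption: the upstream name begins with "<namespace>-".
    return not upstream_name.startswith(namespace + "-")


print(is_cross_environment("master.frontend.url", "development-branch-api"))
```

A hit here points at the caller being configured with the wrong API host, not at the ingress routing the host to the wrong backend.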
