
Kubernetes: stop CloudSQL-proxy sidecar container in multi container Pod/Job

I have a Kubernetes Job that does database migrations on a Cloud SQL database.
One way to access a Cloud SQL database from GKE is to run the Cloud SQL Proxy container as a sidecar and connect via localhost . Great - that's working so far. But because I'm doing this inside a K8s Job, the Job is never marked as successfully finished because the proxy keeps on running.

$ kubectl get po
NAME                      READY     STATUS      RESTARTS   AGE
db-migrations-c1a547      1/2       Completed   0          1m

Even though the output says 'Completed', one of the two containers is still running - the proxy.

How can I make the proxy exit on completing the migrations inside container 1?

The best way I have found is to share the process namespace between containers and use the SYS_PTRACE securityContext capability to allow you to kill the sidecar.

apiVersion: batch/v1
kind: Job
metadata:
  name: my-db-job
spec:
  template:
    spec:
      restartPolicy: OnFailure
      shareProcessNamespace: true
      containers:
      - name: my-db-job-migrations
        command: ["/bin/sh", "-c"]
        args:
          - |
            <your migration commands>;
            sql_proxy_pid=$(pgrep cloud_sql_proxy) && kill -INT $sql_proxy_pid;
        securityContext:
          capabilities:
            add:
              - SYS_PTRACE
      - name: cloudsql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.17
        command:
          - "/cloud_sql_proxy"
        args:
          - "-instances=$(DB_CONNECTION_NAME)=tcp:5432"
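The pgrep-and-kill step can be exercised locally without a cluster. A minimal sketch: a background `sleep` stands in for `cloud_sql_proxy`, and plain `kill` (SIGTERM) replaces `kill -INT`, because non-interactive shells start background jobs with SIGINT ignored - in the Job the proxy is a separate container, so SIGINT works there:

```shell
#!/bin/sh
# Simulation of the pgrep/kill step from the Job above: a background
# "sleep" stands in for cloud_sql_proxy. In the real Pod,
# shareProcessNamespace + SYS_PTRACE are what make the proxy's PID
# visible and killable from the migration container.
sleep 60 &

# Same lookup pattern as the manifest, against the stand-in process
sql_proxy_pid=$(pgrep -f "sleep 60" | head -n1)
kill "$sql_proxy_pid"
wait "$sql_proxy_pid" 2>/dev/null

RESULT="proxy stopped"
echo "$RESULT"
```

If the `kill` failed, the `wait` would block for the full 60 seconds instead of returning immediately.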
          

One possible solution would be a separate cloudsql-proxy Deployment with a matching Service. You would then only need your migration container inside the Job, connecting to your proxy Service instead of localhost.

This comes with some downsides:

  • higher network latency, no pod-local MySQL communication
  • possible security issue if you provide the sql port to your whole kubernetes cluster

If you want to expose cloudsql-proxy to the whole cluster, you have to replace tcp:3306 with tcp:0.0.0.0:3306 in the -instances parameter of the cloudsql-proxy, so it listens on all interfaces instead of only localhost.
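A sketch of that separate deployment - the instance string, image tag, and names are placeholders to adapt to your setup:

```yaml
# Sketch: cloudsql-proxy as its own Deployment plus a ClusterIP Service.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudsql-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cloudsql-proxy
  template:
    metadata:
      labels:
        app: cloudsql-proxy
    spec:
      containers:
      - name: cloudsql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.17
        command: ["/cloud_sql_proxy"]
        # 0.0.0.0 so the port is reachable from outside the pod
        args: ["-instances=<project>:<region>:<instance>=tcp:0.0.0.0:3306"]
---
apiVersion: v1
kind: Service
metadata:
  name: cloudsql-proxy
spec:
  selector:
    app: cloudsql-proxy
  ports:
  - port: 3306
    targetPort: 3306
```

The migration container would then connect to cloudsql-proxy:3306 instead of localhost.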

There are 3 ways of doing this.

1- Use private IP to connect your K8s job to Cloud SQL, as described by @newoxo in one of the answers. To do that, your cluster needs to be a VPC-native cluster. Mine wasn't, and I was not willing to move all my stuff to a new cluster, so I wasn't able to do this.

2- Put the Cloud SQL Proxy container in a separate deployment with a service, as described by @Christian Kohler. This looks like a good approach, but it is not recommended by Google Cloud Support.

I was about to head in this direction (solution #2) but I decided to try something else.

And here is the solution that worked for me:

3- You can communicate between different containers in the same Pod/Job using the file system. The idea is to tell the Cloud SQL Proxy container when the main job is done, and then kill the cloud sql proxy. Here is how to do it:

In the yaml file (my-job.yaml)

apiVersion: v1
kind: Pod
metadata:
  name: my-job-pod
  labels:
    app: my-job-app
spec:
  restartPolicy: OnFailure
  containers:
  - name: my-job-app-container
    image: my-job-image:0.1
    command: ["/bin/bash", "-c"]
    args:
      - |
        trap "touch /lifecycle/main-terminated" EXIT
        { your job commands here }
    volumeMounts:
      - name: lifecycle
        mountPath: /lifecycle
  - name: cloudsql-proxy-container
    image: gcr.io/cloudsql-docker/gce-proxy:1.11
    command: ["/bin/sh", "-c"]
    args:
      - |
        /cloud_sql_proxy -instances={ your instance name }=tcp:3306 -credential_file=/secrets/cloudsql/credentials.json &
        PID=$!
        while true; do
            if [ -f "/lifecycle/main-terminated" ]; then
                kill $PID
                exit 0
            fi
            sleep 1
        done
    securityContext:
      runAsUser: 2  # non-root user
      allowPrivilegeEscalation: false
    volumeMounts:
      - name: cloudsql-instance-credentials
        mountPath: /secrets/cloudsql
        readOnly: true
      - name: lifecycle
        mountPath: /lifecycle
  volumes:
  - name: cloudsql-instance-credentials
    secret:
      secretName: cloudsql-instance-credentials
  - name: lifecycle
    emptyDir: {}

Basically, when your main job is done, it creates a file in /lifecycle that is picked up by the watcher loop added to the cloudsql-proxy container, which then kills the proxy and terminates the container.
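The file-signal pattern can be tried locally with nothing but a shell. A sketch: a temp dir stands in for the shared emptyDir volume, a `sleep` stands in for the proxy, and plain `kill` replaces the in-pod signal:

```shell
#!/bin/sh
# Local simulation of the /lifecycle signal file: a temp dir replaces the
# shared emptyDir volume and "sleep 30" replaces cloud_sql_proxy.
LIFECYCLE=$(mktemp -d)

# "Sidecar": a long-running process plus the watcher loop from the manifest.
sleep 30 &
PID=$!
(
    while true; do
        if [ -f "$LIFECYCLE/main-terminated" ]; then
            kill $PID
            exit 0
        fi
        sleep 1
    done
) &
WATCHER=$!

# "Main container": finish the work, then drop the flag file
# (the manifest does this via: trap "touch /lifecycle/main-terminated" EXIT).
touch "$LIFECYCLE/main-terminated"

wait $WATCHER          # returns once the watcher has killed the "proxy"
RESULT="sidecar stopped"
echo "$RESULT"
rm -rf "$LIFECYCLE"
```

The whole run finishes in a second or two; without the flag file it would block for the full 30 seconds.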

I hope it helps! Let me know if you have any questions.

Based on: https://stackoverflow.com/a/52156131/7747292

It doesn't look like Kubernetes can do this alone; you would need to manually kill the proxy once the migration exits. A similar question was asked here: Sidecar containers in Kubernetes Jobs?

Google Cloud SQL has recently launched private IP address connectivity for Cloud SQL. If the Cloud SQL instance and the Kubernetes cluster are in the same region, you can connect to Cloud SQL without using the Cloud SQL Proxy.

https://cloud.google.com/sql/docs/mysql/connect-kubernetes-engine#private-ip
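With private IP enabled, the Job container connects straight to the instance's private address and the proxy sidecar disappears entirely. A sketch - the IP and the env var names are placeholders of my own choosing:

```yaml
# Sketch: Job container connecting directly over private IP (no sidecar).
containers:
- name: db-migrations
  image: my-migrations-image
  env:
  - name: DB_HOST
    value: "10.0.0.5"   # the Cloud SQL instance's private IP
  - name: DB_PORT
    value: "3306"
```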

A possible solution would be to set concurrencyPolicy: Replace in the CronJob spec (this is a CronJob field; a plain Job has no equivalent) ... this will replace the currently running pod with a new instance whenever the schedule fires again. But you have to make sure that the subsequent cron runs are separated enough.
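A sketch of where that field lives - schedule, names, and image are made up, and clusters older than 1.21 need batch/v1beta1:

```yaml
# Sketch: CronJob whose next run replaces a still-running one,
# so a lingering proxy container gets torn down on the next schedule.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-db-job
spec:
  schedule: "0 3 * * *"
  concurrencyPolicy: Replace   # replace the running Job instead of piling up
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: my-db-job
            image: my-job-image:0.1
```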

Unfortunately the other answers weren't working for me, because Cloud SQL Proxy now runs in a distroless environment where there is no shell.

I managed to get around this by bundling a Cloud SQL Proxy binary with my deployment and running a shell script that starts Cloud SQL Proxy, runs my app, and then kills the proxy.

Dockerfile:

FROM golang:1.19.4

RUN apt update
COPY . /etc/mycode/
WORKDIR /etc/mycode
RUN chmod u+x ./scripts/run_migrations.sh
RUN chmod u+x ./bin/cloud_sql_proxy.linux-amd64

RUN go install
ENTRYPOINT ["./scripts/run_migrations.sh"]

Shell Script (run_migrations.sh):

#!/bin/sh

# This script is run from the parent directory
dbConnectionString=$1
cloudSQLProxyPort=$2

echo "Starting Cloud SQL Proxy"
./bin/cloud_sql_proxy.linux-amd64 -instances=${dbConnectionString}=tcp:5432 -enable_iam_login -structured_logs &
CHILD_PID=$!
echo "CloudSQLProxy PID: $CHILD_PID"

echo "Migrating DB..."
go run ./db/migrations/main.go
MAIN_EXIT_CODE=$?

kill $CHILD_PID;
echo "Migrations complete.";

exit $MAIN_EXIT_CODE

K8s (via Pulumi):

import * as k8s from '@pulumi/kubernetes'

const jobDBMigrations = new k8s.batch.v1.Job("job-db-migrations", {
      metadata: {
        namespace: namespaceName,
        labels: appLabels,
      },
      spec: {
        backoffLimit: 4,
        template: {
          spec: {
            containers: [
              {
                image: pulumi.interpolate`gcr.io/${gcpProject}/${migrationsId}:${migrationsVersion}`,
                name: "server-db-migration",
                args: [
                  dbConnectionString,
                ],
              },
            ],
            restartPolicy: "Never",
            serviceAccount: k8sSAMigration.metadata.name,
          },
        },
      },
    },
    {
      provider: clusterProvider,
    });
