
Generate, store and clean data across multiple processes

I have a simple application that stores its runtime data in a directory on the host machine, say /tmp. This data is used by different workers of the same application.

Whenever I start my app, I want to make sure no data from a previous session is left there, so I clean up that directory on startup.

But when I deploy this on Kubernetes, if a pod is killed and restarts, the data is cleared. How can we avoid this?

# Sample of how things are currently done
class Work:
    def __init__(self):
        # Remove everything inside the /tmp directory
        pass  # placeholder, cleanup code omitted

    def work(self):
        # Generate user data in the /tmp directory
        pass  # placeholder, data generation omitted
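
For illustration, the cleanup step in `__init__` could look roughly like the sketch below. The helper name `clear_directory` is hypothetical and not part of the original app; it is only one way to empty a directory while keeping the directory itself.

```python
import os
import shutil


def clear_directory(path):
    """Delete every entry inside `path` but keep `path` itself.

    Returns the number of top-level entries removed.
    """
    # Snapshot the entries first so we do not delete while iterating.
    with os.scandir(path) as it:
        entries = list(it)

    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)   # real sub-directory: remove recursively
        else:
            os.remove(entry.path)       # regular file or symlink
    return len(entries)
```

Because this runs every time a `Work` object is constructed, any restart of the process (including a pod restart) wipes whatever was collected before.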

The app is a Flask application served by gunicorn. This works perfectly when we run it in a simple environment.

But when we run it on Kubernetes, if a pod gets killed, a new pod is created, and this removes the existing data from the /tmp directory, which resets the information we have collected so far.

This looks like a general problem that people must have encountered before. Please suggest an existing methodology for this.

If you are storing the data on the host machine, that is, mounting the node's file system into the pod with a hostPath volume, then the data should still be there even if the pod restarts.

If I understood correctly, you are probably writing the files to the pod's own file system, so when the pod restarts it will be empty.

In this case, you can use a hostPath volume, which mounts a directory from the node (host) into the pod. Also, use the pod name as a directory prefix,

so your file system will be /tmp/<POD Name>/data-here

To get the pod name you can use the Downward API. I think in this case no changes are required on the app side; only minor changes are required in the YAML.

In the example below, note the subPathExpr: $(pod_name) and the env entry that gets the pod name.

Data will be written on the node at /var/log/app_data/<POD-Name>/

Example

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "3"
  labels:
    app.kubernetes.io/component: app
  name: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: app  # must match the pod template labels
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: app
    spec:
      containers:
      - env:
        - name: pod_name
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        image: <IMAGE>
        name: app
        ports:
        - containerPort: 443
          name: app-tomcat-port
          protocol: TCP
        volumeMounts:
        - mountPath: /tmp
          name: app-data
          subPathExpr: $(pod_name)
      volumes:
      - hostPath:
          path: /var/log/app_data
          type: DirectoryOrCreate
        name: app-data
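
To make the mapping in the manifest above concrete, here is a small sketch of where a file written inside the container would land on the node. The function `host_path_for` and the pod name used in the usage note are hypothetical and for illustration only; Kubernetes does this resolution itself when it expands subPathExpr.

```python
import posixpath


def host_path_for(host_root, pod_name, container_mount, container_file):
    """Return the node path backing `container_file`.

    Assumes a hostPath volume rooted at `host_root`, mounted at
    `container_mount` with subPathExpr set to the pod name.
    """
    rel = posixpath.relpath(container_file, container_mount)
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{container_file} is not under {container_mount}")
    # hostPath root + per-pod sub-directory + path relative to the mount
    return posixpath.normpath(posixpath.join(host_root, pod_name, rel))
```

For example, with a made-up pod name `app-abc123`, `host_path_for("/var/log/app_data", "app-abc123", "/tmp", "/tmp/userdata/a.json")` gives `/var/log/app_data/app-abc123/userdata/a.json`.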

Read more about hostPath: https://kubernetes.io/docs/concepts/storage/volumes/#hostpath

Subpath expression doc: https://kubernetes.io/docs/concepts/storage/volumes/#using-subpath-expanded-environment
