
AWS deployment with Kubernetes 1.7.2: pods continuously getting killed and restarted

We have run into an issue with our AWS deployment using Kubernetes and Helm, where we are seeing "Pod sandbox changed, it will be killed and re-created". This was not happening before; it started with our latest deployment, where we deleted the previous deployment with helm delete and created a new one with helm install. We are not sure whether this is related to our new dependency on AWS SQS or to updating the Kubernetes/Helm/kops versions. Other pods on the same Kubernetes node are working fine.

These pods keep getting killed and restarted, with the following messages repeating:

  • Pod sandbox changed, it will be killed and re-created
  • Killing container with id docker://xxx:Need to kill Pod
  • Back-off restarting failed container
  • Error syncing pod

Manually killing the pod does bring up a new pod, as Kubernetes normally would, but that doesn't fix the issue, unlike what some in related threads have reported.

Values for CPU and memory:

resources:
  limits:
    cpu: 100m
    memory: 128Mi
  requests:
    cpu: 100m
    memory: 128Mi

Version info:

- client version 1.9 (also tried 1.6 and 1.7)
- server version 1.7 (git version 1.7.2)
- helm version 2.7.2
- kops version 1.8.0
- Kernel Version: 4.4.102-k8s
- OS Image: Debian GNU/Linux 8 (jessie)
- Container Runtime Version: docker://1.12.6
- Kubelet Version: v1.7.2
- Kube-Proxy Version: v1.7.2
- Operating system: linux
- Architecture: amd64

We have already gone through the relevant threads for this error, but they seem to concern different environments, and the versions listed in those threads are not the ones we use.

- https://stackoverflow.com/questions/46826164/kubernetes-pods-failing-on-pod-sandbox-changed-it-will-be-killed-and-re-create
- https://stackoverflow.com/questions/46922452/kubernetes-1-7-on-google-cloud-failedsync-error-syncing-pod-sandboxchanged-pod

Any pointers on finding root cause or fixing the issue would be very helpful. Thanks a lot.

The fix turned out to be increasing the memory limit. We changed the values.yaml file used by Helm (the section below) and bumped up the limits...

resources:
  limits:
    cpu: 100m
    memory: 128Mi   # <--- increased this value...
  requests:
    cpu: 100m
    memory: 128Mi

Wish the error message showing up was more specific than "Pod sandbox changed, it will be killed and re-created" :-)
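Since the event text does not name the cause, one way to check whether a memory limit is involved is to look at the container statuses of the pod for a termination reason of OOMKilled. The thread does not confirm that these pods were actually OOM-killed, so treat that as an assumption. The following is a minimal sketch in Python: the helper names (oom_killed_containers, fetch_pod) and the sample data are hypothetical, while the fields it reads (status.containerStatuses[].state and lastState, terminated.reason) are the ones reported by kubectl get pod -o json.

```python
import json
import subprocess


def oom_killed_containers(pod: dict) -> list:
    """Return names of containers whose current or last state ended with OOMKilled."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # A restarted container reports its previous run under lastState.
        for key in ("state", "lastState"):
            terminated = (status.get(key) or {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                names.append(status["name"])
                break
    return names


def fetch_pod(name: str, namespace: str = "default") -> dict:
    """Fetch a pod object via kubectl (assumes kubectl is on PATH with cluster access)."""
    out = subprocess.check_output(
        ["kubectl", "get", "pod", name, "-n", namespace, "-o", "json"]
    )
    return json.loads(out)


# Made-up status fragment for illustration only; it is not taken from the cluster above.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "worker",
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            },
            {"name": "sidecar", "state": {"running": {}}, "lastState": {}},
        ]
    }
}

print(oom_killed_containers(sample_pod))  # prints ['worker']
```

Against a real cluster, oom_killed_containers(fetch_pod("my-pod")) would run the same check, where "my-pod" is a placeholder name. An empty result does not rule out memory pressure; it only means no container reported OOMKilled as its termination reason.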


 