
Copy-on-write style memory reuse for Kubernetes pods? To make pods spawn faster and use memory more efficiently

  • Can Kubernetes pods share a significant amount of memory?

  • Does copy-on-write style forking exist for pods?

The purpose is to make pods spawn faster and use less memory.

Our scenario is that we have a dedicated game server to host in Kubernetes. The problem is that one instance of the dedicated game server takes up a few GB of memory upfront (e.g. 3 GB).

We also have several Docker images of game servers, one each for game A, game B, and so on. Let's call a pod that runs game A's image pod A.

Let's say we now have 3 x pod A and 5 x pod B. Now players are rushing into game B, so I urgently need, say, another 4 x pod B.

I can surely spawn 4 more pod B. Kubernetes supports this perfectly. However, there are 2 problems:

  • The booting of my game server is very slow (30s - 1min). Players don't want to wait.
  • More importantly for us, the cost of running this many pods that each take up so much memory is very high, because pods do not share memory as far as I know. On a plain old EC2 machine or bare metal, by contrast, processes can share memory because they can fork and then rely on copy-on-write.

Copy-on-write style forking and memory sharing seem to solve both problems.
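To make the fork-and-share behaviour referred to above concrete, here is a minimal sketch of it on a single Linux host, outside Kubernetes. The function name fork_workers and the small bytearray standing in for the multi-GB game state are assumptions made for illustration; nothing here is specific to the game server in the question.

```python
import os


def fork_workers(shared_state, n_workers):
    """Fork n_workers children that inherit shared_state via copy-on-write.

    shared_state is a bytearray standing in for pre-loaded game data
    (hypothetical). Returns the exit code reported by each child.
    """
    pids = []
    for i in range(n_workers):
        pid = os.fork()
        if pid == 0:
            # Child: pages of shared_state are shared with the parent until
            # written. This write makes the kernel copy only the touched page
            # into the child's private memory.
            shared_state[0] = i + 1
            ok = shared_state[0] == i + 1
            os._exit(0 if ok else 1)
        pids.append(pid)

    exit_codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))
    return exit_codes


if __name__ == "__main__":
    state = bytearray(4096)  # stand-in for the expensive, pre-initialised state
    print(fork_workers(state, 3))
    print(state[0])  # the parent's copy is unaffected by the children's writes
```

Each child starts instantly with the parent's already-initialised state and only pays for the pages it modifies. This works because all processes share one kernel, which is exactly what separate pods on separate nodes do not have.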

...we have a few such docker images of game servers

...a pod that's running game A's image

Your program runs as a container, not as the pod. This means you can fork() inside the container just as you would when using Docker as the orchestrator. If your k8s cluster's container runtime is also Docker, the Pod is just a logical host for your program.

One of Kubernetes' assumptions is that pods can be scheduled on different Nodes, which contradicts the idea of sharing common resources (this does not apply to storage, where there are many options and documentation available). The situation is different when it comes to sharing resources between containers in one pod, but that doesn't apply to your issue.

However, it seems that there is some possibility to share memory - not well documented and I guess very uncommon in Kubernetes. Check my answers with more details below:

Can Kubernetes pods share a significant amount of memory?

What I found is that pods can share a common IPC namespace with the host (node). You can check Pod Security Policies, especially the hostIPC field:

HostIPC - Controls whether the pod containers can share the host IPC namespace.

Some usage examples and possible security issues can be found here.

Keep in mind that this solution is not common in Kubernetes. Pods with elevated privileges can end up with broader permissions than needed:

The way PSPs are applied to Pods has proven confusing to nearly everyone that has attempted to use them. It is easy to accidentally grant broader permissions than intended, and difficult to inspect which PSP(s) apply in a given situation.

That's why the Kubernetes team marked Pod Security Policies as deprecated as of Kubernetes v1.21; see this article for more information.

Also, if your cluster has multiple nodes, you should use nodeSelector to make sure the pods are assigned to the same node, so that they can all share that one host's IPC namespace.
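As a rough illustration of the two settings mentioned above, the sketch below builds a Pod manifest with hostIPC enabled and a nodeSelector pinning it to one node, then prints it as JSON (which kubectl accepts in place of YAML). The pod name, image and node hostname are placeholders, and game_pod_manifest is a hypothetical helper; only the hostIPC and nodeSelector fields and the kubernetes.io/hostname label come from Kubernetes itself.

```python
import json


def game_pod_manifest(name, image, node_hostname):
    """Return a Pod manifest that shares the host IPC namespace.

    All argument values are supplied by the caller; the ones used in the
    example below are placeholders, not values from the original question.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Use the node's IPC namespace instead of a private one.
            "hostIPC": True,
            # Pin the pod to a single node so that all such pods see the
            # same host IPC namespace.
            "nodeSelector": {"kubernetes.io/hostname": node_hostname},
            "containers": [{"name": "game-server", "image": image}],
        },
    }


if __name__ == "__main__":
    manifest = game_pod_manifest("pod-b-1", "example.com/game-b:latest", "node-1")
    print(json.dumps(manifest, indent=2))
```

Whether a cluster admits such a pod depends on its security policy, so treat this as a starting point for experiments rather than a recommended setup.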

Does copy-on-write style forking exist for pods?

I did some research and didn't find any information about this possibility, so I think it is not possible.


I think the main issue is that your game architecture is not "very suitable" for Kubernetes. Check these articles and websites about dedicated game servers in Kubernetes; maybe you will find them useful:

A different way to resolve the issue would be if some of the initialisation can be baked into the image.

As part of the Docker image build, start up the game server and do as much of the 30s - 1min initialisation as possible, then dump that part of the memory into a file in the image. On game server boot-up, use mmap (with MAP_PRIVATE and possibly even MAP_FIXED) to map the pre-calculated file into memory.

That would solve the problem with the game server boot-up time, and probably also with the memory use; everything in the stack should be doing copy-on-write all the way from the image through to the pod (although you'd have to confirm whether it actually does).
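The boot-up mapping step might look roughly like the sketch below. It assumes a Unix host and a snapshot file already baked into the image; the function name map_snapshot_private is hypothetical. Python's mmap module does not let the caller choose the mapping address, so the MAP_FIXED variant is not shown.

```python
import mmap
import os


def map_snapshot_private(path):
    """Map a pre-computed snapshot file copy-on-write (MAP_PRIVATE).

    Reads are served from the page cache. Writes go to private copies of
    the touched pages and are never written back to the file.
    """
    # Open read-only: a private mapping may be writable even though the
    # file descriptor is not.
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        return mmap.mmap(
            fd,
            size,
            flags=mmap.MAP_PRIVATE,
            prot=mmap.PROT_READ | mmap.PROT_WRITE,
        )
    finally:
        # The mapping stays valid after the descriptor is closed.
        os.close(fd)
```

Several processes mapping the same file this way can share the unmodified pages through the page cache. Whether that sharing survives across containers depends on the storage driver and on the containers using the same image layer, which is the part that still has to be confirmed by measurement.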

It would also have the benefit that it's plain k8s with no special tricks; no requirements for special permissions or node selection or anything, nothing to break or require reimplementation on upgrades or otherwise get in the way. You will be able to run it on any k8s cluster, whether your own or any of the cloud offerings, as well as in your CI/CD pipeline and dev machine.
