When I launch a fresh Ubuntu machine (EC2) and download a single docker image which I run for a long time, after a couple weeks the disk fills up. How do I prevent this from happening?
Everything I find online talks about running docker prune, but my issue is not lots of stray Docker images or volumes sitting around: this EC2 instance downloads a single image and launches it exactly once, then keeps it running forever (it's a CI runner).
Here are some clues: I docker pull the image, which is only 2.5 GB (an Ubuntu minimal image), and launch it once with:

docker run -it -d --rm --shm-size=2gb --env --user root --name running-docker-ci ghcr.io/secret/docker-ci:latest start
Here is the diagnosis I've done:
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 10098432 10082048 0 100% /
devtmpfs 8192212 0 8192212 0% /dev
tmpfs 8198028 0 8198028 0% /dev/shm
tmpfs 1639608 164876 1474732 11% /run
tmpfs 5120 0 5120 0% /run/lock
tmpfs 8198028 0 8198028 0% /sys/fs/cgroup
/dev/loop0 34176 34176 0 100% /snap/amazon-ssm-agent/3552
/dev/loop1 56832 56832 0 100% /snap/core18/1988
/dev/loop4 33152 33152 0 100% /snap/snapd/11588
/dev/loop5 56832 56832 0 100% /snap/core18/1997
/dev/loop6 72192 72192 0 100% /snap/lxd/19647
/dev/loop7 69248 69248 0 100% /snap/lxd/20326
/dev/loop2 32896 32896 0 100% /snap/snapd/11841
tmpfs 1639604 0 1639604 0% /run/user/1000
And running du repeatedly led me to this being my biggest directory:
/var/lib/docker$ sudo du -s * | sort -nr | head -50
13842100 overlay2
14888 image
128 containers
72 buildkit
56 network
28 volumes
20 plugins
20 builder
4 trust
4 tmp
4 swarm
4 runtimes
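To confirm that the space sits in the container's writable layer rather than in images or volumes, the docker CLI itself can break the usage down. A hedged sketch (it assumes the docker CLI is on PATH and degrades to a no-op otherwise; errors are swallowed so it stays harmless as a diagnostic):

```shell
#!/bin/sh
# Break down docker's disk usage (a no-op where docker is unavailable).
if command -v docker >/dev/null 2>&1; then
  # SIZE column = bytes accumulated in each container's writable layer
  docker ps --all --size 2>/dev/null || true
  # Totals per object type: images, containers, local volumes, build cache
  docker system df 2>/dev/null || true
fi
```

If `docker ps --all --size` shows a large SIZE for the one running container, the growth is inside its upper (writable) layer under overlay2, not in stray images.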
Any help? I'm stumped.
More details:

larsks suggested the growth might be happening inside the container. It doesn't appear to be: nothing I run generates files. Oddly, I noticed that df inside the container shows 8 GB used by the overlay filesystem:
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
overlay 8065444 8049060 0 100% /
tmpfs 65536 0 65536 0% /dev
tmpfs 8198028 0 8198028 0% /sys/fs/cgroup
shm 2097152 16 2097136 1% /dev/shm
/dev/root 8065444 8049060 0 100% /etc/hosts
tmpfs 8198028 0 8198028 0% /proc/acpi
tmpfs 8198028 0 8198028 0% /proc/scsi
tmpfs 8198028 0 8198028 0% /sys/firmware
But when I run du on the directory tree, it does not add up to anywhere close to 8 GB. I ran this from the root of the filesystem inside the running container:
$ sudo du -s * | sort -nr | head -50
3945724 home
1094712 usr
254652 opt
151984 var
3080 etc
252 run
192 tmp
24 root
16 dev
4 srv
4 mnt
4 media
4 boot
0 sys
0 sbin
0 proc
0 libx32
0 lib64
0 lib32
0 lib
0 bin
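A classic reason df reports more usage than du can find is a file that was deleted while a process still holds it open: the directory entry is gone (so du never counts it), but the kernel keeps the blocks allocated until the last file descriptor closes. That is worth ruling out inside the container before blaming the storage driver. The sketch below reproduces the effect with a throwaway 8 MB file (the path and size are made up for the demo):

```shell
#!/bin/sh
# Demonstrate space held by a deleted-but-open file: du stops counting
# it, but the kernel keeps the blocks until the descriptor closes.
tmp=$(mktemp -d)
dd if=/dev/zero of="$tmp/big" bs=1M count=8 2>/dev/null

tail -f "$tmp/big" >/dev/null &   # hold a file descriptor open
pid=$!
sleep 1                           # let tail open the file
rm "$tmp/big"                     # unlink the name

du -sk "$tmp"                     # the 8 MB have vanished from du's view
fdlist=$(ls -l "/proc/$pid/fd")   # ...but one fd points at "big (deleted)"
echo "$fdlist"
deleted=$(echo "$fdlist" | grep -c deleted)

kill "$pid"
rmdir "$tmp"
```

In a real container, `ls -l /proc/*/fd 2>/dev/null | grep deleted` (or `lsof +L1`, where available) lists such files and the processes holding them.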
It appears that part of how OverlayFS works is that delete operations don't always free up space in your filesystem. From the docs:
- Deleting files and directories:
When a file is deleted within a container, a whiteout file is created in the container (upperdir). The version of the file in the image layer (lowerdir) is not deleted (because the lowerdir is read-only). However, the whiteout file prevents it from being available to the container.
When a directory is deleted within a container, an opaque directory is created within the container (upperdir). This works in the same way as a whiteout file and effectively prevents the directory from being accessed, even though it still exists in the image (lowerdir).
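You can see these whiteouts directly on the host: overlayfs represents a deleted lower-layer file as a character device with device number 0:0 in the upper layer. A hedged sketch, assuming Docker's default overlay2 storage driver and layout (the /var/lib/docker/overlay2 path is an assumption; point UPPER at a specific container's diff directory to narrow it down):

```shell
#!/bin/sh
# Count whiteout entries in overlay upper layers. Overlayfs marks a
# deleted lower-layer file with a character device (0:0) in upperdir.
# UPPER is an assumed default location; prints 0 if the path does not
# exist or holds no whiteouts.
UPPER=${UPPER:-/var/lib/docker/overlay2}
whiteouts=$(find "$UPPER" -type c 2>/dev/null | wc -l)
echo "$whiteouts whiteout entries under $UPPER"
```

Lots of whiteout entries in the runner's upperdir is a strong hint that the CI jobs delete image-layer files, which consumes space in the writable layer rather than freeing it.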
Without knowing your CI procedures it's hard to say precisely, but the point remains: if you think you're removing files, the filesystem is likely retaining some or all of their contents.

Just as an aside, since you mentioned you're on AWS, you might consider a serverless CI deployment so that your containers start from a clean slate on every run.
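Short of going serverless, a pragmatic workaround for a long-lived runner is to recreate the container on a schedule, so its writable layer (with all the accumulated whiteout bookkeeping) is discarded. A sketch of a cron entry reusing the run command from the question — the 03:00 schedule is an arbitrary choice, and the bare `--env` flag from the original command is left out here because its value was elided in the question:

```shell
# /etc/cron.d/recreate-ci-runner (sketch, not a drop-in file):
# recreate the runner nightly; since it was started with --rm,
# force-removing it also deletes its writable layer.
0 3 * * * root docker rm -f running-docker-ci; docker run -it -d --rm --shm-size=2gb --user root --name running-docker-ci ghcr.io/secret/docker-ci:latest start
```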