
Total size of files in Kafka logs directory is less than the sum of their sizes

I'm testing a Kafka producer application and noticed something strange about the disk usage of the Kafka logs. When I look at the total size of a partition's log directory while the application is writing to Kafka, I see this:

$ ls -l --block-size=kB kafka-logs/mytopic-0
total 52311kB
-rw-rw-r-- 1 app-data app-data 10486kB Oct 29 12:45 00000000000000000000.index
-rw-rw-r-- 1 app-data app-data 46505kB Oct 29 12:45 00000000000000000000.log
-rw-rw-r-- 1 app-data app-data 10486kB Oct 29 12:45 00000000000000000000.timeindex
-rw-rw-r-- 1 app-data app-data     1kB Oct 29 11:55 leader-epoch-checkpoint

Then I stop my application, repeat the command a few minutes later, and get this:

$ ls -l --block-size=kB kafka-logs/mytopic-0
total 46519kB
-rw-rw-r-- 1 app-data app-data 10486kB Oct 29 12:45 00000000000000000000.index
-rw-rw-r-- 1 app-data app-data 46505kB Oct 29 12:45 00000000000000000000.log
-rw-rw-r-- 1 app-data app-data 10486kB Oct 29 12:45 00000000000000000000.timeindex
-rw-rw-r-- 1 app-data app-data     1kB Oct 29 11:55 leader-epoch-checkpoint

Questions: Why does the total reported by ls not equal the sum of the sizes of all the files in that directory? And why does the total decrease a few minutes after the producer application stops, even though every file in the directory keeps the same size?

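The files might be sparse, that is, contain holes. ls -l lists each file's apparent size, whereas the total line counts the disk blocks actually allocated, so sparse files make the total smaller than the sum of the listed sizes. Can you run the following command and compare its output with that of plain du?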

du --apparent-size *
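
To see the effect in isolation, here is a minimal sketch that creates a sparse file and compares its apparent size with the space actually allocated for it. It assumes GNU coreutils (truncate, stat -c, du --apparent-size) and a filesystem that supports holes; the file name demo.index is made up for the demonstration and is not a real Kafka file.

```shell
#!/bin/sh
# Sketch: a sparse file reports a large apparent size but few allocated blocks.
# Assumes GNU coreutils and a filesystem that supports sparse files.

tmpdir=$(mktemp -d)
f="$tmpdir/demo.index"   # hypothetical name, for illustration only

# Extend the file to 10485760 bytes without writing any data.
# On most filesystems this leaves a hole instead of allocating blocks.
truncate -s 10485760 "$f"

apparent=$(stat -c %s "$f")     # size in bytes, as shown by ls -l
blocks=$(stat -c %b "$f")       # number of blocks allocated
block_size=$(stat -c %B "$f")   # size in bytes of each block counted by %b
allocated=$((blocks * block_size))

echo "apparent size:  $apparent bytes"
echo "allocated size: $allocated bytes"

# The same comparison with du, plus the per-file allocation column of ls -s.
du --apparent-size -k "$f"
du -k "$f"
ls -ls --block-size=kB "$tmpdir"

rm -rf "$tmpdir"
```

If the two du figures differ in the same way for the files in kafka-logs/mytopic-0, that would suggest the files there are sparse as well.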
