
How does du estimate file size?

I am downloading a large file with wget, which I ran in the background with wget -bqc. I wanted to see how much of the file had been downloaded, so I ran

du -sh *

in the directory. (I'd also be interested to know a better way to check wget progress in this case, if anyone knows.) I saw that 25 GB had been downloaded, but on several subsequent attempts it showed the same result of 25 GB. I became worried that du had somehow interfered with the download, until some time later du showed a result of 33 GB and subsequently 40 GB.

Searching Stack Overflow and elsewhere online, I didn't find out whether it is safe to use du on files that are being written to, but I did see that its result is only an estimate that can be somewhat off. However, 7-8 GB seems like a lot, particularly because this is a single file and not a directory tree, which seems to be what causes errors in the estimate. I'd be interested to know how it makes this estimate for a single file that is being written, and why I would see this result.

The operating system has to guarantee safe access, so running du on a file that is being written to does not interfere with the download.

du does not estimate anything. The kernel knows the size of the file, and when du asks for it, that is what it gets.
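The figures the kernel keeps for a file can be inspected directly. The following is a minimal sketch, assuming GNU coreutils (`stat -c`) and a throwaway file named `sample.bin` (a hypothetical name) created in a temporary directory. It prints the logical length of the file and the space allocated for it, which are the two numbers a tool like du can ask the kernel for.

```shell
# Assumes GNU coreutils; sample.bin is a hypothetical file name.
tmpdir=$(mktemp -d)
file="$tmpdir/sample.bin"

# Write 1 MiB of data so the file has a known logical length.
dd if=/dev/zero of="$file" bs=1024 count=1024 2>/dev/null

apparent=$(stat -c %s "$file")   # logical length in bytes
blocks=$(stat -c %b "$file")     # number of allocated blocks
blksz=$(stat -c %B "$file")      # size in bytes of each block counted by %b

echo "apparent size: $apparent bytes"
echo "allocated:     $((blocks * blksz)) bytes"

rm -rf "$tmpdir"
```

The allocated figure depends on the filesystem (block size, compression, delayed allocation), so it may not match the logical length exactly.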

If the file is in the gigabyte range and the size is reported only with that granularity, it should not be a surprise that consecutive invocations show the same size. Do you expect wget to fetch enough data to tick over to the next gigabyte between your checks? You can try running du without -sh (it is the -h option that rounds the figure to a human-readable unit) in order to get a more precise reading.
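One way to see progress below the gigabyte level is to compare exact byte counts taken a few seconds apart. This is only a sketch, assuming GNU `stat`; the function name `bytes_grown` and its default 5-second interval are illustrative choices, not part of any tool.

```shell
# Print how many bytes a file grew during an interval (default: 5 seconds).
# Assumes GNU stat; bytes_grown is a hypothetical helper name.
bytes_grown() {
    file=$1
    interval=${2:-5}
    before=$(stat -c %s "$file")   # logical length before waiting
    sleep "$interval"
    after=$(stat -c %s "$file")    # logical length after waiting
    echo $((after - before))
}
```

Calling it as `bytes_grown bigfile.iso 10` (with `bigfile.iso` standing in for the file wget is writing) would print the number of bytes added in ten seconds. A result of 0 over a long interval would suggest the download has stalled.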

Also, wget will hold some amount of data in RAM, but that should be negligible.

du doesn't estimate, it sums up. But it has access to some file-system-internal information, which can make its output surprising. The various aspects are best looked up separately, as they are too much to explain here in detail.

  1. Sparse files may make a file look bigger than it is on disk.
  2. Hard links may make a directory tree look bigger than it is on disk.
  3. Block sizes may make a file look smaller than it is on disk.

By default, du prints the space that a directory tree (or several) actually occupies on disk. For various reasons (the three most common are given above), this can differ from the size of the information stored in these trees.
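The three effects listed above can be reproduced on a small scale. The sketch below assumes GNU coreutils (`truncate`, `stat -c`, and the `--apparent-size` option of du) and a filesystem that supports sparse files and hard links; all file names are hypothetical.

```shell
# Assumes GNU coreutils; file names are hypothetical.
tmpdir=$(mktemp -d)

# Sparse file: 100 MiB logical length, but no data has been written.
truncate -s 104857600 "$tmpdir/sparse.bin"

# Tiny file: one byte of data, which usually still occupies a whole block.
printf 'x' > "$tmpdir/tiny.txt"

# Hard link: a second name for the same file, not a second copy.
ln "$tmpdir/tiny.txt" "$tmpdir/tiny-link.txt"

sparse_len=$(stat -c %s "$tmpdir/sparse.bin")   # logical length in bytes
links=$(stat -c %h "$tmpdir/tiny.txt")          # number of hard links

echo "logical sizes (KiB):"
du -k --apparent-size "$tmpdir"/*
echo "disk usage (KiB):"
du -k "$tmpdir"/*

rm -rf "$tmpdir"
```

On a typical Linux filesystem, the second listing would be expected to show far less than 100 MiB for `sparse.bin` and a full block for the one-byte file, and GNU du should count the hard-linked file only once. The exact numbers vary by filesystem.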
