
How to exclude big files while compressing a directory with tar

I want to compress a directory in Linux. I created a tar.gz, but it turned out to be a big file because the directory contains some *.o files and some PDF files.

Is there any way to compress a directory but exclude files larger than a predefined size? The tar command has an --exclude argument, but I would like to reject files larger than 1 MB. The constraint is the size of the file, not its name.

Based on Jan-Philip Gehrcke's response:

find . -type f -size -1024k -print0 | tar -czf archive.tar.gz --null -T -

for files less than 1M. Tested on OS X and Ubuntu Linux.
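One way to check a pipeline of this kind is to run it in a scratch directory and list what ended up in the archive. The following is a minimal sketch, assuming GNU or BSD find and tar; the file names small.txt and big.o are made up for illustration, and the archive is written outside the scanned directory so it cannot appear in its own file list.

```shell
# Scratch directories; all names below are hypothetical.
workdir=$(mktemp -d)
outdir=$(mktemp -d)
printf 'hello\n' > "$workdir/small.txt"
# 2 MiB of zeros stands in for a large object file.
dd if=/dev/zero of="$workdir/big.o" bs=1024 count=2048 2>/dev/null

# Archive only regular files below the 1024k threshold. NUL separators
# (-print0 / --null) keep file names with spaces or newlines intact.
(cd "$workdir" && find . -type f -size -1024k -print0 |
    tar -czf "$outdir/archive.tar.gz" --null -T -)

# List the archive contents to see which files were picked up.
listing=$(tar -tzf "$outdir/archive.tar.gz")
echo "$listing"
```

The listing should contain small.txt but not big.o, since the 2 MiB file is over the threshold.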

The ... | tar c --null -T - solution above is the best if you have adequate memory (i.e. the file list fits into memory easily, which is true in most cases). However, xargs does have a place if you are memory-constrained, but you have to use it appropriately so that the multiple tar invocations have no ill effect.

To compress, you may use:

find . -type f -size -1024k -print0 | xargs -0 tar c | gzip > archive.tar.gz

This results in a file of concatenated tar archives, gzipped together into the resulting file. (You may also use cz and omit | gzip, since concatenated gzip archives are still valid gzip, but you lose a tiny bit of compression, or quite a bit of compression if you use bzip2 or xz instead of gzip.)

To extract the resulting file you have to use the --ignore-zeros or -i option of tar, so that it extracts every archive in the file rather than only the first one:

tar xizf archive.tar.gz
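To see why -i matters, the sketch below forces xargs to start one tar per file and then counts the archive members with and without --ignore-zeros. It assumes GNU tar. The -n 1 option is only there to make the effect visible on a tiny file set, the file names are hypothetical, and tar cf - is used so that the archive is written to standard output explicitly.

```shell
workdir=$(mktemp -d)
outdir=$(mktemp -d)
for name in a b c; do
    printf '%s\n' "$name" > "$workdir/$name.txt"
done

# One tar invocation per file, so the output is three tar archives back to back.
(cd "$workdir" && find . -type f -size -1024k -print0 |
    xargs -0 -n 1 tar cf - | gzip > "$outdir/archive.tar.gz")

# Without -i, tar stops at the end-of-archive marker of the first archive.
count_default=$(tar tzf "$outdir/archive.tar.gz" 2>/dev/null | wc -l)
# With -i (--ignore-zeros), tar keeps reading past the markers.
count_all=$(tar tizf "$outdir/archive.tar.gz" | wc -l)
echo "without -i: $count_default member(s), with -i: $count_all member(s)"
```

In normal use xargs only splits the list when it exceeds the command-line length limit, so the problem tends to show up on large trees rather than small ones.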

You could use a combination of find (with its -size flag) and xargs to pass the file list to tar.

Something like:

find . -type f -size -100k -print | xargs tar cvf archive.tar

for files less than 100k. See man find for the other size options.
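As an illustration of those size options, here is a small sketch of how the unit suffix affects matching. It assumes GNU find, which rounds file sizes up to the given unit before comparing, so -size -1M matches only empty files while -size -1024k matches files of up to 1023 KiB. The file name notes.txt is hypothetical.

```shell
workdir=$(mktemp -d)
# A 500-byte file: far below 1 MiB, but not empty.
head -c 500 /dev/zero > "$workdir/notes.txt"

# Rounded up to 1 KiB units the file counts as 1, and 1 < 1024, so it matches.
by_kib=$(find "$workdir" -type f -size -1024k | wc -l)
# Rounded up to 1 MiB units the file also counts as 1, and 1 < 1 is false.
by_mib=$(find "$workdir" -type f -size -1M | wc -l)
# With the c suffix (bytes) no rounding is involved: 500 < 1048576.
by_bytes=$(find "$workdir" -type f -size -1048576c | wc -l)

echo "-1024k: $by_kib  -1M: $by_mib  -1048576c: $by_bytes"
```

For a "smaller than 1 MB" filter this is a reason to prefer the k or c suffix over M.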

find ./myRep/ -type f -size -1024k | xargs tar cfvz myArchive.tar

In short, the first part of this expression recursively builds a list of the files under ./myRep/ that are smaller than 1024k, and the second part creates the tar/gzip archive.
