Files compressed through a "pv" pipeline differ from files compressed without it

Here is my script:

tar cf - testdir | pv -s $(du -sb testdir | awk '{print $1}') | pigz -1 > pv.tar.gz

tar cf - testdir | pigz -1 > nopv.tar.gz

diff pv.tar.gz nopv.tar.gz

The output is "Binary files pv.tar.gz and nopv.tar.gz differ".

I ran hexdump on both files and found that only the first line differs slightly:

pv.tar.gz: 8b1f 0008 9e24 5fc8 0304 bdec 5f7b c71b

nopv.tar.gz: 8b1f 0008 9c18 5fc8 0304 bdec 5f7b c71b

But after I extracted both archives and compared the results, the two copies of testdir were exactly the same.

How can I make the two tar.gz files identical?

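This has nothing to do with pv. Bytes 5 to 8 of a gzip header (counting from 1) hold the timestamp, which will be different each time you run the command. In your hexdump, which prints little-endian 16-bit words, these are the words 9e24 5fc8 versus 9c18 5fc8. You can tell pigz not to store the timestamp with the -m switch, so your commands become: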

tar cf - testdir | pv -s $(du -sb testdir | awk '{print $1}') | pigz -1 -m > pv.tar.gz

tar cf - testdir | pigz -1 -m > nopv.tar.gz

which should give you identical files. When you hexdump them, you will see that the timestamp bytes are all 00 now.
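
To check the timestamp field directly instead of reading it out of a hexdump, a small helper along these lines may be useful. This is only a sketch: the function name gz_mtime is made up for illustration, and it assumes a POSIX shell and od. It reads header bytes 5 to 8 one at a time and combines them as a little-endian 32-bit value, which is how the gzip format stores the timestamp.

```sh
# gz_mtime (hypothetical helper name): print the timestamp stored in a
# gzip header as seconds since the Unix epoch. 0 means no timestamp.
gz_mtime() {
    # -An: no offset column; -j4: skip the first 4 bytes (magic, method, flags);
    # -N4: read 4 bytes; -tu1: print each byte as an unsigned decimal number.
    set -- $(od -An -j4 -N4 -tu1 "$1")
    # The field is stored least significant byte first.
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Usage:
#   gz_mtime pv.tar.gz
#   gz_mtime nopv.tar.gz
```

Decoded this way, the bytes shown in the question are 0x5fc89e24 (1606983204) and 0x5fc89c18 (1606982680), two timestamps 524 seconds apart. For files written with -m, the helper should print 0 for both.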
