
Comparing checksums of tarball archive with original directory

I'm wondering how to verify a tarball backup against the original directory by comparing checksums after creating it.

Is it possible to do so without extracting the archive, for example if it's a large 20 GB backup?

For example, take a directory with two files:

mkdir test &&
echo "one" > test/one.txt &&
echo "two" > test/two.txt

Get the checksum of the directory:

find test/ -type f -print0 | sort -z | xargs -0 shasum | shasum

Resulting checksum of directory content:

d191c793cacc4bec1f070eb96fa68524cca566f8  -

Create tarball:

tar -czf test.tar.gz test/

The checksum of the directory content stays constant.

But when I create the archive and take the checksum of the archive file itself, I notice that the result varies from one run to the next. Why is that?

How would I go about getting the checksum of the tarball content to compare to the directory content checksum?

Or is there a better way to check that the archive contains all the necessary content from the original directory (without extracting it, if it's large)?

Your directory checksum computes the SHA-1 of each file's contents and then hashes the resulting list of digests and file names. To do the same calculation for the archive, you would need to read and decompress the entire tar archive. That doesn't mean you'd need to save the contents of the archive anywhere. You'd just need to read it sequentially, hash each file's data in memory as it streams past, and discard it.


 