
Why do you need to immediately verify checksums when uploading or downloading from an object storage system?

Object storage systems like AWS S3 and Google Cloud Storage recommend verifying the integrity of uploaded and downloaded objects immediately after transfer, to ensure no corruption occurred.

For example, the AWS CLI doc mentions:

Upload: The AWS CLI will calculate and auto-populate the Content-MD5 header for both standard and multipart uploads. If the checksum that S3 calculates does not match the Content-MD5 provided, S3 will not store the object and instead will return an error message back to the AWS CLI.

Download: The AWS CLI will attempt to verify the checksum of downloads when possible, based on the ETag header returned from a GetObject request that's performed whenever the AWS CLI downloads objects from S3. If the calculated MD5 checksum does not match the expected checksum, the file is deleted and the download is retried.
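To make the two checks above concrete, here is a minimal sketch of what the client side computes: the Content-MD5 header is the base64-encoded binary MD5 digest, and for a simple (single-part, unencrypted) upload the ETag that S3 returns is the hex MD5 of the object. The function names are hypothetical, not part of the AWS CLI:

```python
import base64
import hashlib

def content_md5(data: bytes) -> str:
    """Base64-encoded binary MD5 digest: the format the Content-MD5 header expects."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

def matches_etag(data: bytes, etag: str) -> bool:
    """Compare a local MD5 against an S3 ETag.

    Only valid for single-part, unencrypted uploads, where the ETag is the
    hex MD5 of the object; multipart ETags use a different scheme.
    """
    return hashlib.md5(data).hexdigest() == etag.strip('"')

payload = b"hello object storage"
header_value = content_md5(payload)  # send as Content-MD5 with the PUT
```

On download, the CLI does the reverse: it hashes the bytes it received and compares against the ETag, retrying if they differ.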

Given that TCP incorporates automatic integrity checking, why do these systems require an additional checksum to verify integrity? It seems like by using TCP we should be able to ensure corruption did not occur in transfer.

There could be many reasons, but the first one that comes to mind is verifying that the client's payload didn't get corrupted (or maliciously modified) while the client was reading it from the data source, before the data was actually transferred. Similarly, there could be corruption when writing to storage on the cloud end. Using a checksum on both ends is a way to hedge against that, even if it's highly unlikely.
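The "both ends" idea can be sketched as hashing each chunk the moment it leaves the local process, then comparing that digest to one the server computes over what it actually wrote. The `send_chunk` callback here is a hypothetical stand-in for the transport, not a real API:

```python
import hashlib
import io

def upload_with_digest(src, send_chunk, chunk_size=1 << 20):
    """Stream `src` through `send_chunk`, hashing each chunk as it is read.

    Returns the hex digest to compare against the digest the remote end
    computed over the bytes it actually received and stored.
    """
    h = hashlib.sha256()
    while chunk := src.read(chunk_size):
        h.update(chunk)
        send_chunk(chunk)
    return h.hexdigest()

# Usage sketch: a fake "remote" that hashes what it receives.
remote = hashlib.sha256()
local_digest = upload_with_digest(io.BytesIO(b"payload" * 1000), remote.update)
assert local_digest == remote.hexdigest()  # both ends agree: nothing changed in flight
```

If a disk, bus, or memory error flips a bit anywhere between the read and the write, the two digests diverge and the transfer can be retried instead of silently storing bad data.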

TCP (and UDP) checksums will not always protect you, and they have been known to be weak for years (if not decades). A quick search yields these; I am sure you can find other (and maybe better) references:

https://www.evanjones.ca/tcp-and-ethernet-checksums-fail.html
https://www.evanjones.ca/tcp-checksums.html
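One structural weakness is easy to demonstrate: the TCP/UDP/IP checksum is a 16-bit one's-complement sum (RFC 1071), and addition is commutative, so reordering 16-bit words in the payload leaves the checksum unchanged. A small sketch:

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum per RFC 1071, as used by IP/TCP/UDP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF

pkt1 = b"\x12\x34\x56\x78"
pkt2 = b"\x56\x78\x12\x34"  # same 16-bit words, swapped order
assert pkt1 != pkt2
assert internet_checksum(pkt1) == internet_checksum(pkt2)  # collision
```

So a network element that reorders or mangles data in just the wrong way can produce a corrupted segment that still passes the TCP checksum; a stronger end-to-end checksum (MD5, SHA-256, CRC32C) would catch it.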

That is not the only reason: you may experience data corruption on your local disk, or your CPU may be corrupting bits while encrypting the data (it does happen), or some other uncommon problem.

More generally, all these systems are designed to handle the corner cases and odd situations that happen so rarely that most of us will not experience them in decades. But because these systems serve so many users and move so many bytes, they experience those failures daily. In other words: rare events at large enough scale happen often.
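The scale argument is just arithmetic. With purely illustrative numbers (not measurements), suppose one transfer in ten million suffers an undetected error:

```python
# Purely illustrative failure rate, not a measured one.
p = 1e-7                      # undetected errors per transfer
per_user = 10                 # transfers/day for one user
per_service = 1_000_000_000   # transfers/day for a large service

days_to_first_error = 1 / (p * per_user)      # ~1,000,000 days for one user
errors_per_day_at_scale = p * per_service     # ~100 errors/day for the service

print(days_to_first_error / 365)  # roughly 2740 years
print(errors_per_day_at_scale)    # roughly 100
```

A single user would wait millennia to see one such error; the service sees a hundred every day, which is why it must check for them explicitly.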
