What byte encoding is used to compute the Amazon-S3 ETag for multi-part uploads?

For reference, see the discussion in this linked question:

What is the algorithm to compute the Amazon-S3 Etag for a file larger than 5GB?

The steps to recreate the MD5 hash are: 1) concatenate the MD5 hashes of each uploaded part, 2) convert the concatenated hash into binary, 3) compute the MD5 hash of that binary, then 4) append a hyphen and the number of parts to the hash. That all sounds easy enough, but I'm struggling with step 3. To hash the binary I need to convert the string into a byte array, and to get the byte array I need to know which encoding to use. That's the part I'm missing. Do I use ASCII, UTF8, Unicode, BigEndian, or something else?

I've tried using the four formats above and none have produced the correct hash. I just can't seem to figure this one out. The code I'm using is:

CompleteMultipartUploadResponse compResp = new CompleteMultipartUploadResponse();
CompleteMultipartUploadRequest compReq = new CompleteMultipartUploadRequest();
string requestETagHash = "";

compResp = client.CompleteMultipartUpload(compReq);
string compETag = compResp.ETag;                                            
foreach (PartETag s in compReq.PartETags)
{
    requestETagHash += s.ETag.Replace('\"', ' ').Trim().Split('-').First();
}

StringBuilder sb = new StringBuilder();
foreach (char c in requestETagHash)
{
    try
    {
         sb.AppendFormat(Convert.ToString(Convert.ToInt16(c.ToString(), 16), 2).PadLeft(4, '0'));
    }
    catch (Exception ex)
    {
        MessageBox.Show("Hash error:\n\n" + ex.Message);
    }
}
//What encoding is used in this line?
byte[] b = System.Text.Encoding.UTF8.GetBytes(sb.ToString());

byte[] data = md5Hash.ComputeHash(b, 0, b.Length);

StringBuilder sBuilder = new StringBuilder();
for (int i = 0; i < data.Length; i++)
{
    sBuilder.Append(data[i].ToString("x2"));
}

Any help in solving this would be appreciated.

Problem solved. Thank you, Jon! Your comment about my getting the hash late got me thinking about where to find the hash's byte array rather than the hex value I was using. I modified my code to get and concatenate the hash byte array immediately after uploading each file part. Then, after receiving the CompleteMultipartUploadResponse, I hash that concatenated array, and voilà, I get the same hash as the ETag returned from S3 for the completed upload.
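For anyone who lands here with the same problem, a minimal sketch of the fix follows. No text encoding is involved at all: "convert to binary" means turning each 32-character hex digest back into the 16 raw bytes it represents, not building a string of 0s and 1s and encoding that string. The class and method names (MultipartETagSketch, FromPartData, FromPartETags) are hypothetical helpers of my own, not part of the AWS SDK, and the sketch assumes each part's ETag is the plain hex MD5 of that part, which may not hold for uploads encrypted with SSE-KMS or SSE-C.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper for illustration; not part of the AWS SDK.
public static class MultipartETagSketch
{
    // Decode a hex string such as "00ff10" into the raw bytes it represents.
    public static byte[] HexToBytes(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    // Render raw bytes as lowercase hex, two characters per byte.
    public static string ToHex(byte[] bytes)
    {
        return string.Concat(bytes.Select(b => b.ToString("x2")));
    }

    // Variant 1: hash each part's data (e.g. right after uploading it)
    // and keep the raw 16-byte digests.
    public static string FromPartData(IEnumerable<byte[]> parts)
    {
        using (MD5 md5 = MD5.Create())
        {
            List<byte[]> digests = parts.Select(p => md5.ComputeHash(p)).ToList();
            return Combine(md5, digests);
        }
    }

    // Variant 2: start from the per-part ETag strings and hex-decode them.
    // Assumes each part ETag is a quoted or unquoted hex MD5 digest.
    public static string FromPartETags(IEnumerable<string> partETags)
    {
        using (MD5 md5 = MD5.Create())
        {
            List<byte[]> digests = partETags
                .Select(tag => HexToBytes(tag.Trim('"')))
                .ToList();
            return Combine(md5, digests);
        }
    }

    // Concatenate the raw digests, hash the result, append "-<part count>".
    private static string Combine(MD5 md5, List<byte[]> digests)
    {
        byte[] concatenated = digests.SelectMany(d => d).ToArray();
        return ToHex(md5.ComputeHash(concatenated)) + "-" + digests.Count;
    }
}
```

Both variants should give the same string for the same upload: 32 hex characters, a hyphen, and the part count. With the code from the question, passing compReq.PartETags.Select(p => p.ETag) to FromPartETags would be one way to try it, compared against compResp.ETag with its surrounding quotes trimmed.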
