
Get length of base64 decoded data

I need to calculate the length of base64 decoded data.

I have Base-64 data, and I am sending the decoded data as the body of an HTTP response (typo: I meant request, but same idea).

I need to send a Content-Length header.

In the interest of memory usage and performance I'd rather not actually Base-64 decode the data all at once, but rather stream it.

Given base64 data, how do I calculate what the length of the decoded data will be? I need either a general algorithm, or a Java/Scala solution.


EDIT: This is similar to, but not a duplicate of, "Calculate actual data size from Base64 encoded string length", where the OP asks:

...can I calculate the length of the raw data that has been encoded only by looking at the length of the Base64-encoded string?

The answer is no. It is necessary to look at the padding as well.

I want to know how the length and the base64 data can be used to calculate the original length.

Assuming that you can't just use chunked encoding (and thereby avoid sending a Content-Length header), you need to consult the padding thus:

  • Base64 encodes three binary octets into four characters. You have N Base64 characters in total, padding included (so N is a multiple of 4). Let k be the number of trailing '=' chars (ie padding chars: 0, 1 or 2).
  • Let M = 3*floor((N-k)/4), ie the number of octets in "complete" 3-octet chunks.
  • If you have 2 padding chars then you have M + 1 bytes.
  • If you have 1 padding char then you have M + 2 bytes.
  • If you have 0 padding chars then you have M bytes.

Of course, floor() in this case means truncating integer division, ie the normal / operator.

Presumably you can count padding octets relatively easily (eg by seeking to the end of a file, or by looking at the end of a byte array), without having to read the whole Base64-encoded thing sequentially.
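
The steps above could be sketched in Java roughly as follows. The class and method names (PaddedBase64, decodedLength) are made up for illustration and do not come from any library. The sketch assumes padded Base64 with no line separators, where n is the total character count including padding and k is the number of trailing '=' characters.

```java
// Hypothetical helper; the names are illustrative only.
final class PaddedBase64 {
    /**
     * @param n total number of Base64 characters, including '=' padding
     * @param k number of trailing '=' characters (0, 1 or 2)
     * @return number of octets the data decodes to
     */
    static long decodedLength(long n, int k) {
        // Assumption: padded input without line breaks, so n is a multiple of 4.
        if (n % 4 != 0 || k < 0 || k > 2 || k > n) {
            throw new IllegalArgumentException("expected padded Base64 without line breaks");
        }
        long m = 3 * ((n - k) / 4); // octets in complete 3-octet chunks
        switch (k) {
            case 2:  return m + 1;  // last chunk carried one octet
            case 1:  return m + 2;  // last chunk carried two octets
            default: return m;      // no partial chunk
        }
    }
}
```

For example, "QUJDRA==" has n = 8 and k = 2, so M = 3 and the decoded length is 4 bytes (the text "ABCD").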

I arrived at this simple calculation.

If L is the length of the Base-64 encoded data, and p is the number of padding characters (which will be 0, 1, or 2), then the length of the unencoded data is

L * 3 / 4 - p

In my case (with Scala),

bytes.length * 3 / 4 - bytes.reverseIterator.takeWhile(_ == '=').length

NOTE: This assumes that the data does not have line separators. (Often, Base-64 data will have new lines every 72 characters or so.) If it does, exclude line separators from the length L.
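
A minimal Java sketch of the same formula, assuming the encoded data is already held in a byte array of ASCII characters. The class name Base64Size and the method decodedLength are hypothetical. Following the note above, it leaves CR and LF out of L, and it counts p from the end of the array while skipping any trailing line break.

```java
// Hypothetical helper; the names are illustrative only.
final class Base64Size {
    static long decodedLength(byte[] encoded) {
        long l = 0; // L: encoded characters, excluding CR and LF
        for (byte b : encoded) {
            if (b != '\r' && b != '\n') {
                l++;
            }
        }
        int p = 0; // p: trailing '=' padding characters (at most 2)
        for (int i = encoded.length - 1; i >= 0 && p < 2; i--) {
            byte b = encoded[i];
            if (b == '\r' || b == '\n') {
                continue; // ignore a line break after the padding
            }
            if (b != '=') {
                break;
            }
            p++;
        }
        return l * 3 / 4 - p;
    }
}
```

If the data is known to contain no line separators, the first loop is unnecessary and L is simply encoded.length. This sketch only looks at CR and LF; other whitespace is assumed not to occur.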
