
Base64 Decoding breaks when encoding leads with +

Every time I encode a string using Base64 and the result starts with a +, decoding fails with an error saying the length of the string is invalid. If the encoded string does not have the leading +, it decodes just fine. Can anyone please explain why this happens? What would cause the + sign to be generated in some cases? In the example below, this string was encoded but can't be decoded.

+ueJ0q91t5XOnFYP8Xac3A== 

An example of a parameter I am passing would be in the following format prior to encoding: 123_true or 123_false. Would the "_" be causing the random issue with the "+" showing up?

+ is one of the regular Base64 characters, used when the 6 bits being encoded have a value of 62.
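To make the "value of 62" point concrete, here is a minimal sketch. The class and method names (PlusDemo, EncodeByte, FirstByte) are made up for illustration; only Convert.ToBase64String and Convert.FromBase64String come from .NET.

```csharp
using System;

public static class PlusDemo
{
    // Returns the Base64 text for a single byte.
    public static string EncodeByte(byte b)
    {
        return Convert.ToBase64String(new[] { b });
    }

    // Returns the first decoded byte of a Base64 string.
    public static byte FirstByte(string base64)
    {
        return Convert.FromBase64String(base64)[0];
    }
}
```

PlusDemo.EncodeByte(0xFA) gives "+g==", because 0xFA is 11111010 in binary and its top six bits, 111110, are 62. Going the other way, PlusDemo.FirstByte("+ueJ0q91t5XOnFYP8Xac3A==") returns 0xFA. So whether a + appears depends only on the bytes being encoded, not on the "_" in the original parameter.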

My guess is that you're putting this in the query parameter of a URL, where + is the escaped value of space. For that use case, you should use a URL-safe base64 encoding instead:
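If that guess is right, the failure can be reproduced without a web server. The sketch below assumes the receiving side turns an unescaped + into a space, as query-string decoding typically does; QueryStringDemo and its methods are hypothetical helper names, not part of any framework.

```csharp
using System;

public static class QueryStringDemo
{
    // Simulates what query-string decoding does to an unescaped '+'
    // (assumption: the receiver treats '+' as an encoded space).
    public static string SimulateQueryDecoding(string raw)
    {
        return raw.Replace('+', ' ');
    }

    // Returns true if the text decodes as Base64, false on FormatException.
    public static bool CanDecode(string text)
    {
        try
        {
            Convert.FromBase64String(text);
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }
}
```

Convert.FromBase64String ignores white space, so after the + has become a space the string has only 23 significant characters. That is not a multiple of 4, which would explain an "invalid length" error rather than an "invalid character" one.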

Using standard Base64 in URL requires encoding of '+', '/' and '=' characters into special percent-encoded hexadecimal sequences ('+' becomes '%2B', '/' becomes '%2F' and '=' becomes '%3D'), which makes the string unnecessarily longer.

For this reason, modified Base64 for URL variants exist, where the '+' and '/' characters of standard Base64 are respectively replaced by '-' and '_', so that using URL encoders/decoders is no longer necessary and has no impact on the length of the encoded value, leaving the same encoded form intact for use in relational databases, web forms, and object identifiers in general. Some variants allow or require omitting the padding '=' signs to avoid them being confused with field separators, or require that any such padding be percent-encoded. Some libraries (like org.bouncycastle.util.encoders.UrlBase64Encoder) will encode '=' to '.'.
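A minimal sketch of one such variant in C#, assuming the flavour that swaps '+' and '/' for '-' and '_' and omits the '=' padding. UrlSafeBase64, Encode and Decode are illustrative names, not a built-in .NET API.

```csharp
using System;

public static class UrlSafeBase64
{
    // Standard Base64, then swap the two URL-hostile characters
    // and drop the trailing '=' padding.
    public static string Encode(byte[] data)
    {
        return Convert.ToBase64String(data)
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Reverse the character swap and restore the padding
    // so that Convert.FromBase64String accepts the input.
    public static byte[] Decode(string text)
    {
        string s = text.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }
        return Convert.FromBase64String(s);
    }
}
```

With this variant, the bytes behind the string from the question encode to -ueJ0q91t5XOnFYP8Xac3A, which can go into a query string unchanged. Both sides have to agree on the variant, since a standard decoder will reject the '-' character.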

Exactly which path you choose here will depend on whether or not you control both sides. If you do, using the modified (URL-safe) alphabet is probably the best plan. Otherwise, you just need to escape the query parameter.
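For the escaping route, a rough sketch using Uri.EscapeDataString and Uri.UnescapeDataString. The parameter name "token" and the EscapeDemo helpers are assumptions for illustration, and a real web framework would normally do the unescaping on the receiving side for you.

```csharp
using System;

public static class EscapeDemo
{
    // Builds a query-string fragment with the Base64 value percent-encoded.
    // "token" is a hypothetical parameter name.
    public static string BuildQuery(string base64)
    {
        return "token=" + Uri.EscapeDataString(base64);
    }

    // Extracts the value after the first '=' and undoes the percent-encoding.
    public static string ReadValue(string query)
    {
        string escaped = query.Substring(query.IndexOf('=') + 1);
        return Uri.UnescapeDataString(escaped);
    }
}
```

EscapeDemo.BuildQuery("+ueJ0q91t5XOnFYP8Xac3A==") yields token=%2BueJ0q91t5XOnFYP8Xac3A%3D%3D, and ReadValue on that result returns the original string with the leading + intact.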

Example below, this string was encoded but can't be decoded.

+ueJ0q91t5XOnFYP8Xac3A==

That's not true, in itself:

byte[] bytes = Convert.FromBase64String("+ueJ0q91t5XOnFYP8Xac3A==");

works fine... suggesting that it's the propagation of the string that's broken, which is in line with what I've said above.

Similar to the PHP solution for this problem, you can replace +, / and = with the safe characters -, _ and , (comma):

string safeBase64 = base64.Replace('+', '-').Replace('/', '_').Replace('=', ',');

Just before decoding you can replace back the original characters:

string base64 = safeBase64.Replace('-', '+').Replace('_', '/').Replace(',', '=');


 