
Java Base64 encoded string vs. .NET Base64 encoded string

Problem: I have a .NET HTTP handler that receives an HTTP POST of XML which originates, I believe, from a Java system. One element contains a base64-encoded document (the current test file is a PDF). When I take the original PDF and generate a base64 string from it in .NET, there are discrepancies between my string and the corresponding text in the supplied XML.

There are a number of places where one of four things occurs:

  1. The XML file has a single space where .NET has a plus
  2. Similarly, the XML file has a pair of consecutive spaces where .NET has a pair of pluses

    PgplbmRv YmoKNSAw vs. PgplbmRv++YmoKNSAw

  3. Sometimes the XML file has a pair of consecutive spaces where .NET has a pair of pluses, and additional spaces appear nearby in the XML's version

    3kuPs 85QZWYaw BsMNals vs. 3kuPs 85QZWYaw++BsMNals

  4. The source XML has four spaces (the display below looks like 2 spaces) where .NET has a pair of consecutive pluses

    vGDmKEJ gnJeOK vs. vGDmKEJ++gnJeOK

Also, there are no pluses in the source (Java created?) data.

Questions: Can someone help identify what might cause these discrepancies? Most pressingly, how might I address them, given that I can't see a reliable pattern to search and replace against?

Edit: When the POST arrives, the handler URL-decodes the body before deserializing it to an object.

public void ProcessRequest(HttpContext context)
{
    try
    {
        StreamReader reader = new StreamReader(context.Request.InputStream);
        context.Response.ContentType = "text/plain";
        var decodedRequest = HttpUtility.UrlDecode(reader.ReadToEnd());
        ...

The pluses are most likely being converted to spaces by URL decoding, in which a plus represents a space. There shouldn't be any spaces in the actual base64-encoded result; space is not a character in the base64 alphabet. A simple search and replace might correct that, but you should also identify where your data is being URL-decoded.
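As a rough illustration of that effect, the sketch below URL-decodes one of the fragments from the question. It uses `System.Net.WebUtility.UrlDecode` rather than the handler's `HttpUtility.UrlDecode` so that it runs without a reference to System.Web; that substitution, and the `PlusDemo` class name, are assumptions made for the example. Both methods treat `+` as an encoded space.

```csharp
using System;
using System.Net;

static class PlusDemo
{
    static void Main()
    {
        // Fragment taken from the .NET-generated base64 in the question.
        string original = "PgplbmRv++YmoKNSAw";

        // URL decoding treats '+' as an encoded space, so both pluses are lost.
        string decoded = WebUtility.UrlDecode(original);
        Console.WriteLine("[" + decoded + "]");

        // If the sender percent-encodes the value first ('+' becomes %2B),
        // the same decoding step restores the original text.
        string escaped = Uri.EscapeDataString(original);
        Console.WriteLine(WebUtility.UrlDecode(escaped) == original);
    }
}
```

If the sender cannot be changed, the other option is to avoid URL-decoding a body that was never URL-encoded in the first place.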

There were two issues.

  1. URL decoding translated existing pluses into spaces.
  2. The POSTing Java code was forcing a MIME-standard 76-character line length.

The URL decoding also translated the CRLF at the end of each line into two spaces. Those extra characters inflated the document length, which meant the padding equal signs had to be recalculated. The following code strips the padding (recalculating and appending it at the end), turns the spaces back into pluses, and removes the pluses that were CRLF placeholders.

// Convert spaces back to pluses and strip the trailing '=' padding (re-added below)
char[] charDoc = doc.CONTENT.Replace(' ', '+').TrimEnd(new char[] {'='}).ToCharArray();

StringBuilder docBuilder = new StringBuilder();
for (int index = 0; index < charDoc.Length; index++)
{
    // Each 76-character line is followed by a two-character CRLF placeholder ("++"),
    // so the placeholders sit at offsets 76 and 77 of every 78-character block.
    if ((index % 78 == 76) && (index < charDoc.Length - 1) && charDoc[index] == '+' && charDoc[index + 1] == '+')
    {
        index++; // skip both placeholder characters
        continue;
    }
    docBuilder.Append(charDoc[index]);
}
// Re-add padding, if needed: 0 to 2 equals signs so the length is a multiple of 4
docBuilder.Append(new string('=', (4 - docBuilder.Length % 4) % 4));
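One way to sanity-check this repair is a round trip over synthetic data. The sketch below wraps the same loop in a hypothetical `Repair` helper and adds a hypothetical `SimulateDamage` method that reproduces the damage described above (76-character lines, each CRLF turned into two spaces, each plus turned into a space). Both names and the damage model are assumptions made for illustration, not part of the original handler.

```csharp
using System;
using System.Linq;
using System.Text;

static class Base64Repair
{
    // Hypothetical helper: the same logic as the loop above, with the damaged text as a parameter.
    public static string Repair(string damaged)
    {
        char[] chars = damaged.Replace(' ', '+').TrimEnd('=').ToCharArray();
        var builder = new StringBuilder(chars.Length);
        for (int i = 0; i < chars.Length; i++)
        {
            // Offsets 76 and 77 of each 78-character block hold the former CRLF.
            if (i % 78 == 76 && i < chars.Length - 1 && chars[i] == '+' && chars[i + 1] == '+')
            {
                i++;
                continue;
            }
            builder.Append(chars[i]);
        }
        // Restore 0 to 2 '=' characters so the length is a multiple of 4.
        builder.Append('=', (4 - builder.Length % 4) % 4);
        return builder.ToString();
    }

    // Assumed damage model: 76-character lines, CRLF -> two spaces, '+' -> space.
    public static string SimulateDamage(byte[] data)
    {
        string wrapped = Convert.ToBase64String(data, Base64FormattingOptions.InsertLineBreaks);
        return wrapped.Replace("\r\n", "  ").Replace('+', ' ');
    }
}

static class Program
{
    static void Main()
    {
        // 200 bytes; every third byte is 0xFB so the encoded text contains '+'.
        byte[] data = Enumerable.Range(0, 200)
            .Select(i => i % 3 == 0 ? (byte)0xFB : (byte)i)
            .ToArray();

        string repaired = Base64Repair.Repair(Base64Repair.SimulateDamage(data));
        byte[] roundTripped = Convert.FromBase64String(repaired);

        // Should report that the repaired text decodes back to the original bytes.
        Console.WriteLine(roundTripped.SequenceEqual(data));
    }
}
```

The repair depends on every line being exactly 76 characters long and on the text not ending with a line break. A sender that wraps at a different width, or appends a trailing CRLF, would need a different offset check.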
