Java Base64 encoded string vs. .NET Base64 encoded string

Problem: I have a .NET HTTP handler that receives an HTTP POST of XML originating, I believe, from a Java system. One element contains a base64-encoded document (the current test file is a PDF). When I take the original PDF and generate a base64 string in .NET, there are some discrepancies between it and the corresponding text in the supplied XML.

There are a number of places where one of the following occurs:

  1. The XML file has a single space where .NET has a plus.
  2. Similarly, the XML file has a pair of consecutive spaces where .NET has a pair of plusses:

    PgplbmRv YmoKNSAw vs. PgplbmRv++YmoKNSAw

  3. Sometimes the XML file has a pair of consecutive spaces where .NET has plusses, and additional spaces are added nearby in the XML's version:

    3kuPs 85QZWYaw BsMNals vs. 3kuPs 85QZWYaw++BsMNals

  4. The source XML has four spaces (displayed below as two) where .NET has a pair of consecutive plusses:

    vGDmKEJ gnJeOK vs. vGDmKEJ++gnJeOK

Also, there are no plusses anywhere in the source (Java-created?) data.

Questions: Can someone help identify what might cause these discrepancies? Most pressingly, how might I address them, given that I can't see a reliable pattern to search and replace against?

Edit: When the POST arrives, the handler URL-decodes it before deserializing to an object.

public void ProcessRequest(HttpContext context)
{
    try
    {
        StreamReader reader = new StreamReader(context.Request.InputStream);
        context.Response.ContentType = "text/plain";
        var decodedRequest = HttpUtility.UrlDecode(reader.ReadToEnd());
        ...

The plusses are likely being converted to spaces by URL decoding, in which a plus represents a space. There shouldn't be any spaces in the actual base64-encoded result; a space is not a valid base64 character. A simple search and replace might correct that, but you may want to identify where your result is being URL-decoded.
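The effect is easy to reproduce in isolation. The sketch below assumes `System.Net.WebUtility.UrlDecode`, which treats `+` the same way as the `HttpUtility.UrlDecode` call in the question; the class and method names are only for illustration, and the sample string is the .NET-generated fragment quoted in the question.

```csharp
using System;
using System.Net;

public static class PlusDecodingDemo
{
    // Form-style URL decoding: every '+' in the input is read as a space.
    public static string Decode(string text) => WebUtility.UrlDecode(text);

    public static void Main()
    {
        // Fragment of the base64 text as .NET generated it.
        string base64Fragment = "PgplbmRv++YmoKNSAw";
        string decoded = Decode(base64Fragment);

        // No plus survives the decode; each one has become a space.
        Console.WriteLine(decoded.Contains("+"));

        // Mapping spaces back to pluses recovers this particular fragment.
        Console.WriteLine(decoded.Replace(' ', '+') == base64Fragment);
    }
}
```

If the sender did not actually URL-encode the body, skipping the decode step for the base64 element would avoid the damage in the first place.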

There were two issues.

  1. URL decoding translated the existing pluses into spaces.
  2. The POSTing Java code was forcing a MIME-standard 76-character line length.
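For reference, .NET can produce the same MIME-style layout, which makes the second issue easier to see. This is a small sketch that assumes `Base64FormattingOptions.InsertLineBreaks` (a line break after every 76 characters) is a fair stand-in for whatever encoder the Java sender used; the 100-byte buffer is arbitrary sample data and the class name is hypothetical.

```csharp
using System;

public static class MimeLineDemo
{
    // Encodes to base64 with a CRLF after every 76 output characters.
    public static string EncodeWithLineBreaks(byte[] data) =>
        Convert.ToBase64String(data, Base64FormattingOptions.InsertLineBreaks);

    public static void Main()
    {
        byte[] data = new byte[100];   // arbitrary sample payload
        new Random(1).NextBytes(data);

        string wrapped = EncodeWithLineBreaks(data);
        string[] lines = wrapped.Split(new[] { "\r\n" }, StringSplitOptions.None);
        Console.WriteLine(lines[0].Length);   // length of the first full line

        // Convert.FromBase64String ignores white space, so intact CRLFs decode fine.
        byte[] roundTrip = Convert.FromBase64String(wrapped);
        Console.WriteLine(roundTrip.Length == data.Length);
    }
}
```

Line breaks on their own are therefore harmless to the .NET decoder; the trouble here is that they arrived as spaces, which cannot be told apart from decoded pluses except by position.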

The URL decoding also translated the CRLFs at line ends into double spaces. The CRLFs also inflate the document length, which meant the padding equals signs had to be recalculated. The following code strips the padding (recalculating and appending it later), turns spaces back into pluses, and removes those that were CRLF placeholders.

// Convert spaces back to pluses and strip the trailing '=' padding.
char[] charDoc = doc.CONTENT.Replace(' ', '+').TrimEnd('=').ToCharArray();

StringBuilder docBuilder = new StringBuilder();
for (int index = 0; index < charDoc.Length; index++)
{
    // Each 76-character line was followed by CRLF, which arrived as two spaces
    // (now "++"), so offsets 76 and 77 of every 78-character block are placeholders.
    if ((index % 78 == 76) && (index < charDoc.Length - 1) && charDoc[index] == '+' && charDoc[index + 1] == '+')
    {
        index++; // skip both placeholder characters
        continue;
    }
    docBuilder.Append(charDoc[index]);
}
// Re-add padding if needed: appends 0 to 2 '=' characters.
docBuilder.Append(new string('=', (4 - docBuilder.Length % 4) % 4));
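
One way to gain some confidence in this repair is to simulate the damage on known bytes and check that they round-trip. The sketch below packages the same steps in a hypothetical `RepairBase64` helper; `Damage` is an assumed model of what happened to the payload (pluses, CR and LF all arriving as spaces), not the actual handler or sender code.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Base64Repair
{
    // Hypothetical helper: the same steps as the loop above, as a method.
    public static string RepairBase64(string damaged)
    {
        char[] chars = damaged.Replace(' ', '+').TrimEnd('=').ToCharArray();
        var builder = new StringBuilder(chars.Length);
        for (int i = 0; i < chars.Length; i++)
        {
            // Offsets 76 and 77 of each 78-character block held the CRLF.
            if (i % 78 == 76 && i < chars.Length - 1 && chars[i] == '+' && chars[i + 1] == '+')
            {
                i++;
                continue;
            }
            builder.Append(chars[i]);
        }
        // Restore 0 to 2 padding characters.
        builder.Append('=', (4 - builder.Length % 4) % 4);
        return builder.ToString();
    }

    // Assumed model of the corruption: '+', CR and LF all become spaces.
    public static string Damage(byte[] data) =>
        Convert.ToBase64String(data, Base64FormattingOptions.InsertLineBreaks)
               .Replace('+', ' ').Replace('\r', ' ').Replace('\n', ' ');

    public static void Main()
    {
        // 0xFB bytes encode to text containing many '+' characters.
        byte[] original = Enumerable.Repeat((byte)0xFB, 200).ToArray();
        byte[] restored = Convert.FromBase64String(RepairBase64(Damage(original)));
        Console.WriteLine(original.SequenceEqual(restored));
    }
}
```

The repair relies on every line being exactly 76 characters long; a sender that wraps at a different width would need a different period than 78.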
