
Base64 String throwing invalid character error

I keep getting a Base64 invalid character error even though I shouldn't.

The program takes an XML file and exports it to a document. If the user wants, it will compress the file as well. The compression works fine and returns a Base64 string, which is encoded as UTF-8 and written to a file.

When it's time to reload the document into the program, I have to check whether it's compressed or not. The code is simply:

byte[] gzBuffer = System.Convert.FromBase64String(text);
return "1F-8B-08" == BitConverter.ToString(new List<Byte>(gzBuffer).GetRange(4, 3).ToArray());

It checks the beginning of the decoded data to see whether it contains the GZip signature bytes.

Now the thing is, all my tests work. I take a string, compress it, decompress it, and compare it to the original. The problem is when I get the string returned from an ADO Recordset. The string is exactly what was written to the file (with the addition of a "\0" at the end, but I don't think that even does anything; even with it trimmed off, it still throws). I even copied and pasted the entire string into a test method and compressed/decompressed that. Works fine.

The tests pass, but the code fails using the exact same string? The only difference is that instead of declaring a regular string and passing it in, I'm getting one returned from a recordset.

Any ideas on what I am doing wrong?

You say:

The string is exactly what was written to the file (with the addition of a "\0" at the end, but I don't think that even does anything).

In fact, it does do something (it causes your code to throw a FormatException: "Invalid character in a Base-64 string") because Convert.FromBase64String does not consider "\0" to be a valid Base64 character.

  byte[] data1 = Convert.FromBase64String("AAAA\0"); // Throws exception
  byte[] data2 = Convert.FromBase64String("AAAA");   // Works

Solution: Get rid of the zero termination. (Maybe call .TrimEnd('\0').)
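The trimming suggested above could look roughly like the following. This is only a minimal sketch: the class and method names (Base64Cleanup, DecodeIgnoringTrailingNulls) are made up for illustration, and it assumes the stray "\0" characters appear only at the end of the string, as described in the question.

```csharp
using System;

public static class Base64Cleanup
{
    // Hypothetical helper (not part of the original code): removes trailing
    // NUL characters, then decodes the remaining Base64 text.
    public static byte[] DecodeIgnoringTrailingNulls(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        // TrimEnd takes chars, so the NUL is written as the char literal '\0'.
        string cleaned = text.TrimEnd('\0');

        // Still throws FormatException if other invalid characters remain.
        return Convert.FromBase64String(cleaned);
    }
}
```

If the recordset can also return NUL characters in the middle of the value, trimming the end will not be enough and the source of the extra characters would need to be tracked down instead.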

Notes:

The MSDN docs for Convert.FromBase64String say it will throw a FormatException when

The length of s, ignoring white space characters, is not zero or a multiple of 4.

-or-

The format of s is invalid. s contains a non-base 64 character, more than two padding characters, or a non-white space character among the padding characters.

and that

The base 64 digits in ascending order from zero are the uppercase characters 'A' to 'Z', lowercase characters 'a' to 'z', numerals '0' to '9', and the symbols '+' and '/'.

Whether a null char is allowed or not really depends on the Base64 codec in question. Given the vagueness of the Base64 standard (there is no authoritative exact specification), many implementations would just ignore it as white space. Others may flag it as a problem. And the buggiest ones wouldn't notice and would happily try decoding it... :-/

But it sounds like the C# implementation does not like it (which is one valid approach), so if removing it helps, that should be done.

One minor additional comment: UTF-8 is not a requirement; ISO-8859-x (aka Latin-x) and 7-bit ASCII would work as well. This is because Base64 was specifically designed to use only a 7-bit subset, which works with all 7-bit-ASCII-compatible encodings.
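The encoding point can be checked with a small sketch. The class and method names here are invented for illustration. It compares the bytes produced by ASCII, UTF-8, and ISO-8859-1 for the same text, which should be identical whenever the text contains only Base64 characters.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Base64EncodingDemo
{
    // Hypothetical check: true when the text encodes to the same bytes
    // in ASCII, UTF-8, and ISO-8859-1 (Latin-1).
    public static bool SameBytesInAllEncodings(string text)
    {
        byte[] ascii = Encoding.ASCII.GetBytes(text);
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        byte[] latin1 = Encoding.GetEncoding("iso-8859-1").GetBytes(text);

        return ascii.SequenceEqual(utf8) && ascii.SequenceEqual(latin1);
    }
}
```

For example, passing the result of Convert.ToBase64String for any byte array should return true, while a string containing a non-ASCII character such as "é" should return false.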

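One problem when converting Base64 from a string is that some conversion functions expect the leading "data:image/jpg;base64," prefix, while others accept only the actual data.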
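Since Convert.FromBase64String accepts only the raw data, one way to handle input that may or may not carry such a prefix is to strip it first. The following is a sketch only: DataUriHelper and StripDataUriPrefix are hypothetical names, and the code assumes the prefix always ends with ";base64,".

```csharp
using System;

public static class DataUriHelper
{
    // Hypothetical helper: returns only the Base64 payload of a data URI,
    // or the input unchanged when no "data:...;base64," prefix is present.
    public static string StripDataUriPrefix(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        const string marker = ";base64,";
        if (input.StartsWith("data:", StringComparison.OrdinalIgnoreCase))
        {
            int index = input.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
            if (index >= 0)
            {
                // Keep everything after the marker.
                return input.Substring(index + marker.Length);
            }
        }

        return input;
    }
}
```

The result can then be passed to Convert.FromBase64String as usual.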

// Before:
string stringToDecrypt = HttpContext.Current.Request.QueryString.ToString();

// After (URL-decode the query string first):
string stringToDecrypt = HttpUtility.UrlDecode(HttpContext.Current.Request.QueryString.ToString());

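If removing the \0 from the end of the string is not possible, you can append a character of your own to every string you encode and remove it when decoding.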


 