
How to remove BOM from an encoded base64 UTF string?

I have a file that I encoded in base64 on the macOS command line using openssl base64 -in en -out en1, and I am reading it with the following code:

string fileContent = File.ReadAllText(Path.Combine(AppContext.BaseDirectory, MConst.BASE_DIR, "en1"));
var b1 = Convert.FromBase64String(fileContent);
var str1 = System.Text.Encoding.UTF8.GetString(b1);

The string I get has a ? before the actual file content. I am not sure what's causing this; any help would be appreciated.

Example Input:

import pandas
import json

Encoded file example:

77u/DQppbXBvcnQgY29ubmVjdG9yX2FwaQ0KaW1wb3J0IGpzb24NCg0K

Output based on the C# code:

?import pandas
import json

Normally, when you read UTF-8 text (with a BOM) from a text file, the BOM is handled for you behind the scenes. For example, both of the following lines read UTF-8 text correctly whether or not the text file has a BOM:

File.ReadAllText(path, Encoding.UTF8);
File.ReadAllText(path); // UTF8 is the default.

The problem is that you're dealing with UTF-8 text that has been encoded to a Base64 string, so ReadAllText() can no longer handle the BOM for you: it only sees Base64 characters. After Convert.FromBase64String(), the byte array still starts with the BOM bytes EF BB BF (the leading 77u/ in your encoded file), and Encoding.UTF8.GetString() keeps them as the character U+FEFF, which is most likely the ? you are seeing. You can either do it yourself by checking for and removing the first 3 bytes of the byte array, or delegate that job to a StreamReader, which is exactly what ReadAllText() does:

var bytes = Convert.FromBase64String(fileContent);
string finalString = null;

using (var ms = new MemoryStream(bytes))
using (var reader = new StreamReader(ms))  // Or:
// using (var reader = new StreamReader(ms, Encoding.UTF8))
{
    finalString = reader.ReadToEnd();
}
// Proceed to using finalString.
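
For the "do it yourself" route, here is a minimal sketch of checking for the BOM and skipping it before decoding. The class and method names (Base64Utf8, DecodeSkippingBom) are made up for illustration, and the sketch assumes the content is UTF-8:

```csharp
using System;
using System.Text;

public static class Base64Utf8
{
    // Hypothetical helper (not from the original answer): decodes a Base64
    // string to text, skipping a leading UTF-8 BOM (EF BB BF) if one is present.
    public static string DecodeSkippingBom(string base64)
    {
        // FromBase64String ignores white space, so the line breaks that
        // openssl base64 inserts into its output are not a problem.
        byte[] bytes = Convert.FromBase64String(base64);

        // For Encoding.UTF8 the preamble is the 3-byte BOM: EF BB BF.
        byte[] bom = Encoding.UTF8.GetPreamble();

        bool hasBom = bytes.Length >= bom.Length;
        for (int i = 0; hasBom && i < bom.Length; i++)
        {
            if (bytes[i] != bom[i])
            {
                hasBom = false;
            }
        }

        // Decode from just past the BOM, or from the start if there is none.
        int offset = hasBom ? bom.Length : 0;
        return Encoding.UTF8.GetString(bytes, offset, bytes.Length - offset);
    }
}
```

Calling Base64Utf8.DecodeSkippingBom(fileContent) in place of the Convert.FromBase64String / GetString pair should then return the text without the leading character. If you already have the decoded string, str1.TrimStart('\uFEFF') is another way to drop it.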
