Encoding problem with a Spanish file in C#

Question

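I have a file in Spanish stored online in Azure Blob Storage. Some words contain special characters (for example: Almacén). When I open the file in Notepad++, the encoding is shown as ANSI.

So now I am trying to read the file with this code: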

        using StreamReader reader = new StreamReader(Stream, Encoding.UTF8);
        blobStream.Seek(0, SeekOrigin.Begin);
        var allLines = await reader.ReadToEndAsync();

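The problem is that "allLines" does not come out in the correct encoding; I get artifacts such as: Almac�n

I tried some similar solutions, such as: C# Convert string from UTF-8 to ISO-8859-1 (Latin1) H

but it still does not work.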

(The final goal is to "merge" two CSV files, so I read the streams of both, remove the header from one, and concatenate the strings to upload the result again. If there is a better way to merge CSV files in C# that avoids this encoding issue, I am open to that as well.)

Answer 1

You are trying to read a file that is not UTF-8 encoded as if it were UTF-8 encoded. I can reproduce the problem:

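using System;
using System.IO;
using System.Text;

var s = "Almacén";
// Encode the sample text as ISO-8859-1 (code page 28591) to mimic the file's bytes.
using var memStream = new MemoryStream(Encoding.GetEncoding(28591).GetBytes(s));

// Decode those bytes as UTF-8, which is the mistake being reproduced.
using var reader = new StreamReader(memStream, Encoding.UTF8);
var allLines = await reader.ReadToEndAsync();

Console.WriteLine(allLines); // writes "Almac�n" to console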
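
Looking at the raw bytes can make the cause easier to see. The following is only a small sketch with illustrative variable names: in ISO-8859-1 the character é is the single byte 0xE9, whereas UTF-8 encodes it as the two bytes 0xC3 0xA9. A lone 0xE9 followed by an ASCII letter is not a valid UTF-8 sequence, so the UTF-8 decoder substitutes the replacement character U+FFFD.

```csharp
using System;
using System.Text;

var s = "Almacén";

// ISO-8859-1 (code page 28591) uses exactly one byte per character.
byte[] latin1Bytes = Encoding.GetEncoding(28591).GetBytes(s);
// UTF-8 needs two bytes for 'é'.
byte[] utf8Bytes = Encoding.UTF8.GetBytes(s);

Console.WriteLine(BitConverter.ToString(latin1Bytes)); // 41-6C-6D-61-63-E9-6E
Console.WriteLine(BitConverter.ToString(utf8Bytes));   // 41-6C-6D-61-63-C3-A9-6E

// Decoding the ISO-8859-1 bytes as UTF-8 puts U+FFFD where 'é' should be.
string wrong = Encoding.UTF8.GetString(latin1Bytes);
Console.WriteLine(wrong.Contains('\uFFFD')); // True

// Decoding with the encoding the bytes were written in restores the text.
string right = Encoding.GetEncoding(28591).GetString(latin1Bytes);
Console.WriteLine(right == s); // True
```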

You should try reading the file with the ISO-8859-1 "Western European (ISO)" encoding, which is code page 28591.

using var reader = new StreamReader(Stream, Encoding.GetEncoding(28591));
var allLines = await reader.ReadToEndAsync();
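
For the merge described in the question, one possible approach is sketched below. It assumes both CSV files use ISO-8859-1 and share the same header. The MemoryStream objects and their sample rows are hypothetical stand-ins for the two blob streams, and the line-by-line copy assumes no quoted field contains a line break.

```csharp
using System;
using System.IO;
using System.Text;

// Assumption: both files are ISO-8859-1; use a different encoding if yours differ.
Encoding latin1 = Encoding.GetEncoding(28591);

// Hypothetical sample data standing in for the two blob streams.
using var first = new MemoryStream(latin1.GetBytes("Id,Nombre\n1,Almacén\n"));
using var second = new MemoryStream(latin1.GetBytes("Id,Nombre\n2,Camión\n"));

using var firstReader = new StreamReader(first, latin1);
using var secondReader = new StreamReader(second, latin1);

var merged = new StringBuilder();
string? line;

// Copy every line of the first file, header included.
while ((line = await firstReader.ReadLineAsync()) != null)
    merged.Append(line).Append('\n');

// Skip the header of the second file, then copy its remaining rows.
await secondReader.ReadLineAsync();
while ((line = await secondReader.ReadLineAsync()) != null)
    merged.Append(line).Append('\n');

string mergedText = merged.ToString();

// Encode with the same encoding before uploading, so the file stays consistent.
byte[] output = latin1.GetBytes(mergedText);
Console.WriteLine(mergedText);
```

Writing the result back with the same encoding keeps the merged file readable by whatever already consumes the originals. If the consumers can handle it, encoding the output as UTF-8 instead may avoid the ambiguity in the future.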

Answer 1
2 · Accepted · 2021-01-08 11:28:56
