
Excel CSV encoding issues

I have a question about Danish characters when opening a saved CSV file in Excel. See the code below:

        [HttpGet]
        [Route("/progress/data.csv")]
        [Produces("text/csv")]
        public IActionResult GetCSV()
        {
            StringBuilder sb = new StringBuilder();
            sb.AppendLine("æø;2;3;");
            Encoding encode = Encoding.UTF8;
            return File(encode.GetBytes(sb.ToString()), "text/csv", "data.csv");
        }

I am using .NET Core 2.1, and the result of this export is that the first two characters, æø, are displayed as Ã¦Ã¸.
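The garbling described here is what appears when UTF-8 bytes are decoded with a single-byte code page. The following is a minimal sketch of that effect, not part of the original post. The class and method names are hypothetical, and ISO-8859-1 stands in for the Windows-1252 code page that Excel typically falls back to on Western European systems, because it is built into .NET Core and maps these four bytes to the same characters.

```csharp
using System.Text;

public static class MojibakeDemo
{
    // Encodes the text as UTF-8, then decodes those bytes as ISO-8859-1,
    // mimicking a reader that does not know the data is UTF-8.
    // (Hypothetical helper, for illustration only.)
    public static string MisreadUtf8AsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);      // "æø" becomes C3 A6 C3 B8
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1"); // one character per byte
        return latin1.GetString(utf8Bytes);
    }
}
```

Calling `MojibakeDemo.MisreadUtf8AsLatin1("æø")` returns the four-character string `Ã¦Ã¸`: each of the two UTF-8 byte pairs is shown as two separate characters.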

I am aware that this is a known problem, but so far I have not found a solution. During the last 4 hours I have tried at least 15 different approaches, including UTF-8 with and without a BOM: manually adding the BOM with System.Text.Encoding.UTF8.GetPreamble(), various MemoryStream and StreamWriter solutions, and windows-1252 via CodePagesEncodingProvider.Instance.GetEncoding(1252). Nothing works. When I open the file in Excel, the result is always something different than expected.

Does anyone have a solution for this?

Well, the problem is the way Excel deals with the BOM. You might have found the advice to use a StreamWriter. Note, however:

StreamWriter defaults to using an instance of UTF8Encoding unless specified otherwise. This instance of UTF8Encoding is constructed without a byte order mark (BOM), so its GetPreamble method returns an empty byte array. The default UTF-8 encoding for this constructor throws an exception on invalid bytes. This behavior is different from the behavior provided by the encoding object in the Encoding.UTF8 property. To specify a BOM and determine whether an exception is thrown on invalid bytes, use a constructor that accepts an encoding object as a parameter, such as StreamWriter(String, Boolean, Encoding) or StreamWriter.
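The difference described in the quote can be observed directly by writing the same text into a MemoryStream with and without an explicit encoding. This is a small sketch under the assumption that a MemoryStream behaves like the response body for this purpose. `PreambleDemo` and `WriteWith` are hypothetical names introduced only for illustration.

```csharp
using System.IO;
using System.Text;

public static class PreambleDemo
{
    // Writes the text through a StreamWriter and returns the raw bytes produced.
    // Passing null selects StreamWriter's default encoding (UTF-8 without a BOM).
    public static byte[] WriteWith(Encoding encoding, string text)
    {
        using (var ms = new MemoryStream())
        {
            using (var sw = encoding == null
                ? new StreamWriter(ms)
                : new StreamWriter(ms, encoding))
            {
                sw.Write(text);
            } // disposing the writer flushes it, including any preamble

            return ms.ToArray(); // ToArray still works after the stream is closed
        }
    }
}
```

With this helper, `WriteWith(null, "æ")` yields the two bytes C3 A6, while `WriteWith(Encoding.UTF8, "æ")` yields five bytes starting with EF BB BF, because the Encoding.UTF8 instance reports a non-empty preamble.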

So I just created a custom implementation of IActionResult:

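using System.IO;
using System.Text;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

public class Utf8ForExcelCsvResult : IActionResult
{
    public string Content { get; set; }
    public string ContentType { get; set; }
    public string FileName { get; set; }

    public Task ExecuteResultAsync(ActionContext context)
    {
        var response = context.HttpContext.Response;
        response.Headers["Content-Type"] = this.ContentType;
        response.Headers["Content-Disposition"] = $"attachment; filename={this.FileName}; filename*=UTF-8''{this.FileName}";

        // Encoding.UTF8 has a non-empty preamble, so the writer emits the BOM
        // (EF BB BF) before the content. Excel uses it to detect UTF-8.
        // The write is synchronous, which ASP.NET Core 2.1 allows by default.
        using (var sw = new StreamWriter(response.Body, Encoding.UTF8))
        {
            sw.Write(Content);
        }
        return Task.CompletedTask;
    }
}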

When you need a CSV file that opens correctly in Excel, simply return a Utf8ForExcelCsvResult:

[HttpGet]
[Route("/progress/data.csv")]
[Produces("text/csv")]
public IActionResult MyFileDownload()
// public Utf8ForExcelCsvResult MyFileDownload()
{
    StringBuilder sb = new StringBuilder();
    sb.AppendLine("æø;2;3;");
    sb.AppendLine("გამარჯობა");
    sb.AppendLine("ဟယ်လို");
    sb.AppendLine("ສະບາຍດີ");
    sb.AppendLine("cześć");
    sb.AppendLine("こんにちは");
    sb.AppendLine("你好");
    Console.WriteLine(sb.ToString());
    return new Utf8ForExcelCsvResult(){
        Content=sb.ToString(),
        ContentType="text/csv",
        FileName="hello.csv",
    };
}


We can use PowerShell to inspect the hex representation of the CSV file with Format-Hex -Path .\hello.csv:

           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000   EF BB BF C3 A6 C3 B8 3B 32 3B 33 3B 0D 0A E1 83
00000010   92 E1 83 90 E1 83 9B E1 83 90 E1 83 A0 E1 83 AF
00000020   E1 83 9D E1 83 91 E1 83 90 0D 0A E1 80 9F E1 80
00000030   9A E1 80 BA E1 80 9C E1 80 AD E1 80 AF 0D 0A E0
00000040   BA AA E0 BA B0 E0 BA 9A E0 BA B2 E0 BA 8D E0 BA
00000050   94 E0 BA B5 0D 0A 63 7A 65 C5 9B C4 87 0D 0A E3
00000060   81 93 E3 82 93 E3 81 AB E3 81 A1 E3 81 AF 0D 0A
00000070   E4 BD A0 E5 A5 BD 0D 0A

Here the first three bytes, EF BB BF, are the UTF-8 byte order mark (BOM).
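The same check can be done in code rather than by reading a hex dump. The sketch below compares the start of a byte array with the preamble reported by Encoding.UTF8. `BomCheck` and `StartsWithUtf8Bom` are hypothetical names, not part of the answer's code.

```csharp
using System.Text;

public static class BomCheck
{
    // Returns true when the data begins with the UTF-8 preamble EF BB BF.
    public static bool StartsWithUtf8Bom(byte[] data)
    {
        byte[] bom = Encoding.UTF8.GetPreamble();
        if (data == null || data.Length < bom.Length)
        {
            return false;
        }

        for (int i = 0; i < bom.Length; i++)
        {
            if (data[i] != bom[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

For a downloaded file, something like `BomCheck.StartsWithUtf8Bom(File.ReadAllBytes("hello.csv"))` should return true if the BOM was written as intended.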
