Encoding string issue reading a CSV file in C#

I am currently developing a Windows Phone 8 application in which I have to download a CSV file from a web service and convert the data to a C# business object (I do not use a library for this part).

Downloading the file and converting the data to a C# business object is not a problem using RestSharp.Portable, the StreamReader class and the MemoryStream class.

The issue I face is that the string fields are decoded with the wrong encoding.

With the RestSharp.Portable library, I retrieve the CSV file content as a byte array and then convert the data to a string with the following code (where response is a byte array):

using (var streamReader = new StreamReader(new MemoryStream(response)))
{
  while (streamReader.Peek() >= 0)
  {
    var csvLine = streamReader.ReadLine();
  }
}

But instead of Jérome, my csvLine variable contains J rome.

I tried several things to obtain Jérome, without success, such as:

using (var streamReader = new StreamReader(new MemoryStream(response), true))

or

using (var streamReader = new StreamReader(new MemoryStream(response), Encoding.UTF8))

When I open the CSV file in a simple text editor such as Notepad++, I get Jérome only when the encoding is set to ANSI.

But if I try the following code in C#:

using (var streamReader = new StreamReader(new MemoryStream(response), Encoding.GetEncoding("ANSI")))

I get the following exception:

'ANSI' is not a supported encoding name.

Can someone help me decode my CSV file correctly?

Thank you in advance for your help or advice!

"ANSI" is not an encoding name. You need to pick one of the code page identifiers listed here:

https://msdn.microsoft.com/en-us/library/windows/desktop/dd317756(v=vs.85).aspx

If you don't know which encoding the file uses, you can try to guess it. Guessing isn't a perfect solution, as another answer puts it:

You can't detect the codepage, you need to be told it. You can analyse the bytes and guess it, but that can give some bizarre (sometimes amusing) results.
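One common way to "guess" is to attempt a strict UTF-8 decode first and fall back to a single-byte encoding only if the bytes are not valid UTF-8. The sketch below is only an illustration of that idea, not something from the original answers: the class name EncodingGuess and the method DecodeCsvBytes are hypothetical, and the choice of ISO-8859-1 as the fallback is an assumption that only holds if the files really come from a Western European source. It has not been checked on Windows Phone 8 specifically.

```csharp
using System;
using System.Text;

public static class EncodingGuess
{
    // Hypothetical helper: try strict UTF-8, then fall back to ISO-8859-1.
    // The fallback encoding is an assumption about where the data comes from.
    public static string DecodeCsvBytes(byte[] bytes)
    {
        try
        {
            // Second argument (throwOnInvalidBytes: true) makes invalid
            // sequences throw instead of silently becoming U+FFFD.
            var strictUtf8 = new UTF8Encoding(false, true);
            return strictUtf8.GetString(bytes, 0, bytes.Length);
        }
        catch (DecoderFallbackException)
        {
            // Not valid UTF-8: every byte maps to exactly one character
            // in ISO-8859-1, so this decode cannot fail.
            return Encoding.GetEncoding("ISO-8859-1").GetString(bytes, 0, bytes.Length);
        }
    }
}
```

This heuristic works because text in a single-byte encoding that contains accented characters is very rarely valid UTF-8 by accident. It still cannot tell ISO-8859-1 apart from other single-byte code pages, which is exactly the limitation the quote above describes.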

Following Lawtonfogle's link, I tried to use

using (var streamReader = new StreamReader(new MemoryStream(response), Encoding.GetEncoding("Windows-1252")))

But I got the following error:

'Windows-1252' is not a supported encoding name.

Searching the internet for the reason, I finally found a thread with an answer that works for me.

So here is the working solution in my case:

using (var streamReader = new StreamReader(new MemoryStream(response), Encoding.GetEncoding("ISO-8859-1")))
{
  while (streamReader.Peek() >= 0)
  {
    var csvLine = streamReader.ReadLine();
  }
}
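To see why the UTF-8 attempts in the question fail while ISO-8859-1 works, it can help to decode the raw bytes directly. The following is a minimal sketch; the class EncodingDemo and its members are made up for illustration, and the sample bytes assume the file stores é as the single byte 0xE9, which is what an "ANSI" (Windows-1252 or ISO-8859-1) file would contain.

```csharp
using System.Text;

public static class EncodingDemo
{
    // "Jérome" as a single-byte (Latin-1 style) file would store it:
    // J = 0x4A, é = 0xE9, r = 0x72, o = 0x6F, m = 0x6D, e = 0x65.
    public static readonly byte[] Sample = { 0x4A, 0xE9, 0x72, 0x6F, 0x6D, 0x65 };

    // Default UTF-8 decoding: 0xE9 starts a multi-byte sequence, but 0x72
    // is not a continuation byte, so the decoder substitutes U+FFFD.
    public static string AsUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes, 0, bytes.Length);
    }

    // ISO-8859-1 maps each byte to the code point of the same value,
    // so 0xE9 becomes U+00E9 (é).
    public static string AsLatin1(byte[] bytes)
    {
        return Encoding.GetEncoding("ISO-8859-1").GetString(bytes, 0, bytes.Length);
    }
}
```

EncodingDemo.AsLatin1(EncodingDemo.Sample) yields Jérome, while EncodingDemo.AsUtf8(EncodingDemo.Sample) contains the replacement character U+FFFD where the é should be, which is typically displayed as a blank or a question mark. Note that ISO-8859-1 and Windows-1252 agree on bytes 0xA0 to 0xFF (where the accented letters live) but differ in the range 0x80 to 0x9F, so characters such as € or curly quotes from a Windows-1252 file would not decode correctly with ISO-8859-1.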
