Extended ASCII characters such as the euro symbol being converted to their Unicode equivalent

I have the euro symbol stored in an MS-Access database table:

SELECT
CurrencySymbol,
Len(CurrencySymbol) AS DataLength,
Asc(CurrencySymbol) AS AsciiCode
FROM table1;

CurrencySymbol DataLength AsciiCode
-------------- ---------- ---------
€              1          128

And here is the .NET code I am using to read this table:

// Requires: using System.Data.OleDb;
OleDbConnection connection = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + args[0]);
connection.Open();
OleDbCommand command = new OleDbCommand("SELECT * FROM [table1]", connection);
OleDbDataReader reader = command.ExecuteReader();
while (reader.Read())
{
    // Loop over every column of the current row
    for (int i = 0, j = reader.FieldCount; i < j; i++)
    {
        // GetValue returns object; Debug.Print expects a string
        System.Diagnostics.Debug.Print(reader.GetValue(i).ToString());
    }
}

Originally, I was writing the data to a text file using StreamWriter. I noticed that the euro symbol was written as €, which is probably the Unicode euro symbol encoded in UTF-8. Debugger results:

reader.GetValue(i).ToString()                  -> "€"
reader.GetValue(i).ToString().ToCharArray()[0] -> 8364 '€'

How can I force .NET to output the extended ASCII characters as-is? The characters are supposed to be written to a CSV file.

The fact that these two lines:

reader.GetValue(i).ToString()                  -> "€"
reader.GetValue(i).ToString().ToCharArray()[0] -> 8364 '€'

do what you want tells me we can stop looking at data access and MS Access, because that is all working fine. The problem is simply writing that value to a file. The trick, then, is to be explicit when you create the StreamWriter. If you look at the StreamWriter constructors, you'll see that some take an Encoding. If you leave it blank, it will default to UTF-8. So: don't leave it blank. Explicitly pass in your chosen Encoding. I would recommend you figure out exactly which code page you mean, and use:

const int CodePage = ....; // TODO: only you know this
var enc = Encoding.GetEncoding(CodePage);
using(var file = File.Create(path))
using(var writer = new StreamWriter(file, enc)) {
   ... // write the contents
}
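
As a concrete illustration of the snippet above, here is a minimal sketch that assumes the intended code page is Windows-1252 (an assumption, chosen because the question reports Asc() returning 128 for the euro sign, which is its position in that code page). The temp-file name is hypothetical. The RegisterProvider line is only relevant on .NET Core / .NET 5+, where code-page encodings have to be registered before use; on the classic .NET Framework it can be dropped.

```csharp
using System;
using System.IO;
using System.Text;

class EuroCodePageDemo
{
    static void Main()
    {
        // Only needed on .NET Core / .NET 5+: makes legacy code pages available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        const int CodePage = 1252; // ASSUMPTION: Windows-1252; substitute your real code page
        Encoding enc = Encoding.GetEncoding(CodePage);

        // Hypothetical output path, used for illustration only.
        string path = Path.Combine(Path.GetTempPath(), "euro-demo.csv");

        using (var file = File.Create(path))
        using (var writer = new StreamWriter(file, enc))
        {
            writer.Write("\u20AC"); // the euro sign, char code 8364 in .NET
        }

        // With Windows-1252 the file holds the single byte 0x80 (decimal 128).
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(path)));

        // For comparison: the same character encoded as UTF-8 takes three bytes.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes("\u20AC"))); // E2-82-AC
    }
}
```

The string inside .NET is always UTF-16, so 8364 in the debugger is expected; the single byte 128 only exists once the text is encoded with a single-byte code page on the way out.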

You could also use Encoding.Default (the system's default ANSI code page), but that is a bit hit and miss.
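
To see why Encoding.Default is hit and miss, a small sketch like the following can be run on the target machine. It prints which encoding Encoding.Default resolves to and how that encoding represents the euro sign. The result depends on the runtime and system settings: on the .NET Framework it is the system ANSI code page, while on .NET Core / .NET 5+ it is always UTF-8, so no expected output is shown here.

```csharp
using System;
using System.Text;

class DefaultEncodingCheck
{
    static void Main()
    {
        Encoding def = Encoding.Default;

        // Name and code page number of whatever the runtime considers the default.
        Console.WriteLine(def.WebName + " (code page " + def.CodePage + ")");

        // Bytes this encoding would write for the euro sign.
        Console.WriteLine(BitConverter.ToString(def.GetBytes("\u20AC")));
    }
}
```

If the second line is not the byte sequence the consumer of the CSV file expects, pass an explicit Encoding to the StreamWriter instead of relying on the default.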
