Dealing with Unicode characters in an ASCII file?

I have an XML file which I saved as ASCII/UTF-8 using XmlSerializer in C#. One field contains a folder path location. I have recently discovered that on non-English language Windows systems, there can be special characters in the path field. I could save the entire file as Unicode/UTF-16 but that doubles the file size for the sake of a few characters.

Is there a way to insert non-ASCII characters into an otherwise ASCII string?

There's no such thing as ASCII/UTF-8. Those are two distinct encodings that in fact encode different character sets. I suspect that you are actually using ASCII, or perhaps Windows ANSI, at present.

UTF-8 is a complete encoding for Unicode: it can represent every Unicode character. If the file only contains ASCII characters then the UTF-8 encoding is byte-for-byte identical to the ASCII encoding. And if your files are predominantly English, then UTF-8 is the Unicode encoding that produces the smallest files, because each ASCII character still takes one byte and only the non-ASCII characters take more.
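To make the size argument concrete, here is a minimal sketch that compares byte counts for one path. The path and the names `EncodingSizeDemo` and `ByteCount` are made up for illustration. The byte counts exclude any byte order mark.

```csharp
using System;
using System.Text;

public static class EncodingSizeDemo
{
    // Number of bytes the text occupies in the given encoding (no BOM counted).
    public static int ByteCount(string text, Encoding encoding)
    {
        return encoding.GetByteCount(text);
    }

    public static void Main()
    {
        // Hypothetical folder path: 25 characters, one of them non-ASCII (\u00FC is "ü").
        string path = "C:\\Users\\M\u00FCller\\Documents";

        // UTF-8: 24 ASCII characters at 1 byte each, plus 2 bytes for "ü".
        Console.WriteLine(ByteCount(path, Encoding.UTF8));     // 26

        // UTF-16 (called "Unicode" in .NET): 2 bytes for each of these characters.
        Console.WriteLine(ByteCount(path, Encoding.Unicode));  // 50

        // ASCII cannot represent "ü"; the default fallback replaces it with '?'.
        byte[] asciiBytes = Encoding.ASCII.GetBytes(path);
        Console.WriteLine(Encoding.ASCII.GetString(asciiBytes)); // C:\Users\M?ller\Documents
    }
}
```

The last line shows why plain ASCII is not an option here: the non-ASCII character is silently lost rather than stored.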

Conclusion: use UTF-8.
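One way to do that with XmlSerializer is to serialize through a StreamWriter that was created with a UTF-8 encoding. The sketch below assumes a stand-in class, `FolderSettings`, and helper names `Save` and `Load`, none of which come from the question. Your real type and file name will differ.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical settings type standing in for the asker's real class.
public class FolderSettings
{
    public string FolderPath { get; set; }
}

public static class Utf8XmlDemo
{
    // Serializes the settings as UTF-8 XML, overwriting any existing file.
    public static void Save(FolderSettings settings, string fileName)
    {
        var serializer = new XmlSerializer(typeof(FolderSettings));

        // UTF8Encoding(false) writes UTF-8 without a byte order mark.
        using (var writer = new StreamWriter(fileName, false, new UTF8Encoding(false)))
        {
            serializer.Serialize(writer, settings);
        }
    }

    // Reads the file back; the XML reader picks up the encoding from the file itself.
    public static FolderSettings Load(string fileName)
    {
        var serializer = new XmlSerializer(typeof(FolderSettings));
        using (var stream = File.OpenRead(fileName))
        {
            return (FolderSettings)serializer.Deserialize(stream);
        }
    }

    public static void Main()
    {
        string fileName = Path.Combine(Path.GetTempPath(), "settings-demo.xml");
        var original = new FolderSettings { FolderPath = "C:\\Users\\M\u00FCller\\Documents" };

        Save(original, fileName);
        FolderSettings loaded = Load(fileName);

        // True when the non-ASCII character survived the round trip.
        Console.WriteLine(loaded.FolderPath == original.FolderPath);
    }
}
```

Because the serializer takes the encoding from the writer, the XML declaration in the saved file should name utf-8, so other XML readers can decode it without extra hints.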
