
Unicode to ASCII with character translations for umlauts

I have a client that sends Unicode input files and demands only ASCII-encoded files in return; why is unimportant.

Does anyone know of a routine to translate a Unicode string to the closest approximation of an ASCII string? I'm looking to replace common Unicode characters like 'ä' with the best ASCII representation.

For example: 'ä' -> 'a'

The data resides in SQL Server; however, I can also work in C#, either as a downstream mechanism or as a CLR procedure.

Just loop through the string. For each character, do a switch:

switch (inputCharacter)
{
    case 'ä':
        outputString = "ae";
        break;
    case 'ö':
        outputString = "oe";
        break;
    // ...
}

(These translations are common in German when only ASCII is available.)

Then combine all the outputStrings with a StringBuilder.
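The loop-and-switch approach described above could be sketched roughly as follows. This is only a minimal illustration, not a complete transliterator: the class name `GermanTransliterator`, the method `ToAscii`, the extra mappings for 'ü', 'ß' and the uppercase umlauts, and the '?' fallback for unmapped characters are all assumptions added for the example.

```csharp
using System;
using System.Text;

static class GermanTransliterator
{
    // Hypothetical helper: replaces German umlauts and 'ß' with their
    // conventional ASCII digraphs and passes plain ASCII through unchanged.
    public static string ToAscii(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            switch (c)
            {
                case 'ä': sb.Append("ae"); break;
                case 'ö': sb.Append("oe"); break;
                case 'ü': sb.Append("ue"); break;
                case 'Ä': sb.Append("Ae"); break;
                case 'Ö': sb.Append("Oe"); break;
                case 'Ü': sb.Append("Ue"); break;
                case 'ß': sb.Append("ss"); break;
                default:
                    // Assumption: anything else outside ASCII becomes '?'.
                    sb.Append(c <= 127 ? c : '?');
                    break;
            }
        }
        return sb.ToString();
    }
}

static class Program
{
    static void Main()
    {
        // Prints "Kaese und Broetchen"
        Console.WriteLine(GermanTransliterator.ToAscii("Käse und Brötchen"));
    }
}
```

Appending directly to the StringBuilder inside the switch avoids the intermediate outputString variable; whether '?' is an acceptable placeholder depends on what the client expects.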

I think you really mean extended ASCII to ASCII. Just use a simple dictionary:

Dictionary<char, char> trans = new Dictionary<char, char>() {...};
StringBuilder sb = new StringBuilder();
// input is the string to convert
foreach (char c in input)
{
    if (c <= 127)
        sb.Append(c);               // already ASCII, keep as-is
    else
        sb.Append(trans[c]);        // throws KeyNotFoundException if c is not in the table
}
string ascii = sb.ToString();
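One caveat with the dictionary approach is that indexing with a character missing from the table throws. A slightly more defensive sketch might use `TryGetValue` with a fallback character. The class name `AsciiFolder`, the method `Fold`, the table entries, and the '?' default are all assumptions made for illustration; a real table would need to cover whatever characters the client actually sends.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class AsciiFolder
{
    // Illustrative table only; extend it with the characters you expect.
    private static readonly Dictionary<char, char> Trans = new Dictionary<char, char>
    {
        { 'ä', 'a' }, { 'ö', 'o' }, { 'ü', 'u' },
        { 'Ä', 'A' }, { 'Ö', 'O' }, { 'Ü', 'U' },
        { 'é', 'e' }, { 'ñ', 'n' },
    };

    // Hypothetical helper: maps each non-ASCII character through the table,
    // substituting 'fallback' when no mapping exists.
    public static string Fold(string input, char fallback = '?')
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (c <= 127)
                sb.Append(c);                       // already ASCII
            else if (Trans.TryGetValue(c, out char mapped))
                sb.Append(mapped);                  // known translation
            else
                sb.Append(fallback);                // unknown character
        }
        return sb.ToString();
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(AsciiFolder.Fold("Müller, José")); // prints "Muller, Jose"
        Console.WriteLine(AsciiFolder.Fold("€5"));           // prints "?5"
    }
}
```

Because the table maps char to char, it cannot express one-to-many translations such as 'ä' -> "ae"; a `Dictionary<char, string>` would be needed for those.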
