How to convert Hebrew (Unicode) to ASCII in C#?

I have to create a text file containing numbers and Hebrew letters converted to ASCII.

This is the file creation method, which is triggered on ButtonClick:

protected void ToFile(object sender, EventArgs e)
{
    filename = Transactions.generateDateYMDHMS();
    string path = string.Format("{0}{1}.001", Server.MapPath("~/transactions/"), filename);
    StreamWriter sw = new StreamWriter(path, false, Encoding.ASCII);
    sw.WriteLine("hello");
    sw.WriteLine(Transactions.convertUTF8ASCII("שלום"));
    sw.WriteLine("bye");
    sw.Close();
}

As you can see, I use the static method Transactions.convertUTF8ASCII() to convert what is presumably a Unicode string in .NET to its ASCII representation. I use it on the Hebrew word 'shalom' and get back '????' instead of the result I need.

Here is the method:

public static string convertUTF8ASCII(string initialString)
{
    byte[] unicodeBytes = Encoding.Unicode.GetBytes(initialString);
    byte[] asciiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, unicodeBytes);
    return Encoding.ASCII.GetString(asciiBytes);
}

Instead of the initial word converted to ASCII, I get '????' in the file I create. Even when I run the debugger, I get the same result.

What am I doing wrong?

You can't simply translate arbitrary Unicode characters to ASCII. The best the encoder can do is replace each unsupported character with '?', hence the ????. Obviously the basic 7-bit characters will work, but not much else. I'm curious: what is the expected result?
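To see where the question marks come from, here is a minimal sketch, written as C# 9 top-level statements (an assumption about the toolchain; in an older project the same lines would go inside a method). It compares the default Encoding.ASCII, which silently substitutes '?' (0x3F), with an ASCII encoding built with EncoderExceptionFallback, which throws instead. The variable names are illustrative only.

```csharp
using System;
using System.Text;

string hebrew = "שלום";

// The default ASCII encoder uses a replacement fallback:
// every character it cannot represent becomes '?' (byte 0x3F).
byte[] lossy = Encoding.ASCII.GetBytes(hebrew);
Console.WriteLine(Encoding.ASCII.GetString(lossy));

// A strict ASCII encoding: it throws instead of substituting silently.
Encoding strictAscii = Encoding.GetEncoding(
    "us-ascii",
    new EncoderExceptionFallback(),
    new DecoderExceptionFallback());

bool threw = false;
try
{
    strictAscii.GetBytes(hebrew);
}
catch (EncoderFallbackException)
{
    // Reached because none of the Hebrew letters exist in ASCII.
    threw = true;
}
Console.WriteLine(threw);
```

Using the strict variant while debugging makes the data loss visible as an exception rather than as question marks in the output file.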

If you need this for transfer (rather than representation), you might consider base-64 encoding the underlying UTF-8 bytes.
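A minimal sketch of that base-64 approach, assuming both the writer and the reader of the file agree on UTF-8 as the underlying byte encoding (again written as top-level statements, with illustrative variable names):

```csharp
using System;
using System.Text;

string original = "שלום";

// Encode: string -> UTF-8 bytes -> base-64 text (ASCII characters only).
string base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(original));

// Decode on the receiving side: base-64 text -> UTF-8 bytes -> string.
string roundTripped = Encoding.UTF8.GetString(Convert.FromBase64String(base64));

Console.WriteLine(base64);
Console.WriteLine(roundTripped == original);
```

The base-64 string can be written safely through a StreamWriter that uses Encoding.ASCII, but it is not human-readable Hebrew; the consumer has to decode it.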

Do you perhaps mean ANSI, not ASCII?

ASCII doesn't define any Hebrew characters. There are, however, some ANSI code pages that do, such as "windows-1255".

In which case, you may want to look at: http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx

In short, where you have:

Encoding.ASCII

You would replace it with:

Encoding.GetEncoding(1255)
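As a rough sketch of what that replacement does, the lines below encode the Hebrew word with code page 1255 and decode it again. One assumption to note: on .NET Core and .NET 5+, code-page encodings have to be registered through CodePagesEncodingProvider before GetEncoding(1255) works; on .NET Framework (which the Server.MapPath call in the question suggests) that registration should not be needed. Variable names are illustrative.

```csharp
using System;
using System.Text;

// Needed on .NET Core / .NET 5+; not required on .NET Framework.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

Encoding hebrewAnsi = Encoding.GetEncoding(1255); // windows-1255

string text = "שלום";

// windows-1255 is a single-byte code page: one byte per Hebrew letter.
byte[] bytes = hebrewAnsi.GetBytes(text);

Console.WriteLine(BitConverter.ToString(bytes));
Console.WriteLine(hebrewAnsi.GetString(bytes) == text);
```

In the question's ToFile method, this would mean passing the 1255 encoding to the StreamWriter constructor instead of Encoding.ASCII and dropping the convertUTF8ASCII call. Whoever reads the file then has to open it with the same code page.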

I just faced the same issue when the original XML file was in ASCII encoding.

As Userx suggested, use Encoding.GetEncoding(1255):

XDocument.Parse(System.IO.File.ReadAllText(xmlPath, Encoding.GetEncoding(1255)));

So now my XDocument can read Hebrew even though the XML file was saved as ASCII.

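If you really are talking about ASCII, are you asking about transliteration (as in the methods described under "romanization") rather than an encoding conversion?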
