How to convert Hebrew (Unicode) to ASCII in C#?

I have to create a text file containing numbers and Hebrew letters encoded as ASCII.

This is the file creation method, which is triggered by a button click:

protected void ToFile(object sender, EventArgs e)
{
    filename = Transactions.generateDateYMDHMS();
    string path = string.Format("{0}{1}.001", Server.MapPath("~/transactions/"), filename);
    // The using block closes the writer even if a write throws.
    using (StreamWriter sw = new StreamWriter(path, false, Encoding.ASCII))
    {
        sw.WriteLine("hello");
        sw.WriteLine(Transactions.convertUTF8ASCII("שלום")); // this line comes out as "????"
        sw.WriteLine("bye");
    }
}

As you can see, I use the static method Transactions.convertUTF8ASCII() to convert what is presumably a Unicode .NET string to its ASCII representation. When I call it on the Hebrew word 'shalom', I get back '????' instead of the result I need.

Here is the method.

public static string convertUTF8ASCII(string initialString)
{
    byte[] unicodeBytes = Encoding.Unicode.GetBytes(initialString);
    byte[] asciiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, unicodeBytes);
    return Encoding.ASCII.GetString(asciiBytes);
}

Instead of the original word encoded as ASCII, I get '????' in the file I create. I see the same result when I step through the code in the debugger.

What am I doing wrong?

You can't simply translate arbitrary Unicode characters to ASCII. The best the encoder can do is replace each unsupported character with a fallback character, hence ????. The basic 7-bit characters will work, but nothing else will. I'm curious as to what the expected result is?
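
To illustrate the point, here is a minimal sketch of what happens when a string is pushed through Encoding.ASCII. The class and method names (AsciiFallbackDemo, RoundTripThroughAscii) are made up for this example; the round trip is only meant to mirror what the question's convertUTF8ASCII method effectively does.

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Hypothetical helper: encodes text as ASCII and decodes it again.
    // Characters outside the 7-bit range have no ASCII byte, so the
    // default encoder fallback substitutes '?' for each of them.
    public static string RoundTripThroughAscii(string text)
    {
        byte[] asciiBytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(asciiBytes);
    }

    public static void Main()
    {
        Console.WriteLine(RoundTripThroughAscii("hello")); // plain ASCII survives unchanged
        Console.WriteLine(RoundTripThroughAscii("שלום"));  // each Hebrew letter becomes '?'
    }
}
```

The original letters are gone once the bytes have been produced, so no later conversion can bring them back.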

If you need this for transfer (rather than representation), you might consider base-64 encoding the underlying UTF-8 bytes.
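
A minimal sketch of that idea, assuming the receiving side knows it has to reverse the encoding. The names Base64TransferDemo, ToAsciiSafe and FromAsciiSafe are hypothetical; Convert.ToBase64String and Convert.FromBase64String are the standard .NET calls.

```csharp
using System;
using System.Text;

public static class Base64TransferDemo
{
    // Turns any string into text made only of ASCII characters
    // by base-64 encoding its UTF-8 bytes.
    public static string ToAsciiSafe(string text)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(text));
    }

    // Reverses ToAsciiSafe and recovers the original string.
    public static string FromAsciiSafe(string encoded)
    {
        return Encoding.UTF8.GetString(Convert.FromBase64String(encoded));
    }

    public static void Main()
    {
        string encoded = ToAsciiSafe("שלום");
        Console.WriteLine(encoded);                // ASCII-only, safe to write with Encoding.ASCII
        Console.WriteLine(FromAsciiSafe(encoded)); // the original Hebrew word
    }
}
```

The encoded line is not human-readable, so this only helps when the file is consumed by another program rather than read by a person.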

Do you perhaps mean ANSI, not ASCII?

ASCII doesn't define any Hebrew characters. There are, however, some ANSI code pages that do, such as "windows-1255".

In which case, you may want to consider looking at: http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx

In short, where you have:

Encoding.ASCII

You would replace it with:

Encoding.GetEncoding(1255)
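
Applied to the file-writing code from the question, the change might look like the sketch below. Windows1255Demo and WriteHebrewFile are hypothetical names, and the path is supplied by the caller. The RegisterProvider line is an assumption about the runtime: on .NET Core and .NET 5+ code page 1255 is only available after registering CodePagesEncodingProvider, while on the .NET Framework (which the question's Server.MapPath suggests) that line can be dropped.

```csharp
using System.IO;
using System.Text;

public static class Windows1255Demo
{
    // Writes the three lines from the question using the windows-1255
    // code page, in which each Hebrew letter is a single byte.
    public static void WriteHebrewFile(string path)
    {
        // Assumption: needed on .NET Core / .NET 5+, not on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding hebrew = Encoding.GetEncoding(1255);

        using (StreamWriter sw = new StreamWriter(path, false, hebrew))
        {
            sw.WriteLine("hello");
            sw.WriteLine("שלום"); // written directly, no ASCII conversion step
            sw.WriteLine("bye");
        }
    }
}
```

Whatever reads the file has to open it as windows-1255 as well; a reader that assumes ASCII or UTF-8 will not show the Hebrew text correctly.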

I just faced the same issue when the original XML file was in ASCII encoding.

As Userx suggested, I used

Encoding.GetEncoding(1255)

when reading the file:

XDocument.Parse(System.IO.File.ReadAllText(xmlPath, Encoding.GetEncoding(1255)));

So now my XDocument can read the Hebrew text even though the XML file was saved as "ASCII" (presumably it was really saved in an ANSI code page such as windows-1255).

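If you really are talking about ASCII, are you asking about transliteration (as in the methods described under "romanization") rather than an encoding conversion?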
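
If transliteration is what is wanted, a very rough sketch could look like this. Everything here is hypothetical: the class name, the method name, and above all the mapping table, which covers only a handful of letters and ignores vowels and context. A real romanization scheme is considerably more involved.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TransliterationSketch
{
    // Hypothetical, deliberately incomplete letter-to-Latin table.
    private static readonly Dictionary<char, string> Table = new Dictionary<char, string>
    {
        { '\u05E9', "sh" }, // shin
        { '\u05DC', "l" },  // lamed
        { '\u05D5', "v" },  // vav (may also stand for "o" or "u"; not handled here)
        { '\u05DD', "m" },  // final mem
        { '\u05DE', "m" }   // mem
    };

    // Replaces mapped letters, keeps existing ASCII, marks anything else with '?'.
    public static string Transliterate(string text)
    {
        StringBuilder result = new StringBuilder();
        foreach (char c in text)
        {
            string latin;
            if (Table.TryGetValue(c, out latin)) result.Append(latin);
            else if (c <= 127) result.Append(c);
            else result.Append('?');
        }
        return result.ToString();
    }

    public static void Main()
    {
        // With this consonant-only table the word comes out as "shlvm", not "shalom".
        Console.WriteLine(Transliterate("\u05E9\u05DC\u05D5\u05DD"));
    }
}
```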


 