ICQ encoding of special characters

I'm working with the ICQ protocol and have run into a problem with special letters (e.g. diacritics). I read that ICQ uses a different encoding (CP-1251, if I remember correctly).

How can I decode the message bytes into a string using the correct encoding?

I've tried using the UTF8Encoding class, but without success.

I'm using the ICQ-sharp library.

    private void ParseMessage (string uin, byte[] data)
    {
        ushort capabilities_length = LittleEndianBitConverter.Big.ToUInt16 (data, 2);
        ushort msg_tlv_length = LittleEndianBitConverter.Big.ToUInt16 (data, 6 + capabilities_length);
        string message = Encoding.UTF8.GetString (data, 12 + capabilities_length, msg_tlv_length - 4);

        Debug.WriteLine(message);
    }

If the contact uses the same client, everything is fine. Otherwise, incoming and outgoing messages that contain diacritics are unreadable.

I've determined (using this answer: https://stackoverflow.com/a/12853721/846232 ) that the text is in BigEndianUnicode encoding. But if the string contains no diacritics, decoding it that way produces unreadable output (Chinese characters), whereas decoding the same text as UTF-8 works fine. I don't know how to make it decode correctly in every case.

If UTF-8 kind of works (i.e. it works for English, or any US-ASCII characters), then you don't have UTF-16. Latin1 (or Windows-1252, Microsoft's variant), or e.g. Windows-1251 or Windows-1250, are perfectly possible though, since the first part of each of these, the part containing the Latin letters without diacritics, is the same.
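The overlap described above can be checked directly. The following is a minimal sketch, not part of ICQ-sharp: the class name `EncodingOverlapDemo` and its methods are made up for illustration, and the provider registration assumes .NET Core or .NET 5+ (on the classic .NET Framework the Windows code pages are available without it). It decodes the same bytes three ways.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class EncodingOverlapDemo
{
    static EncodingOverlapDemo()
    {
        // Assumption: .NET Core / .NET 5+, where the Windows-125x code pages
        // must be registered before Encoding.GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Decode the whole buffer as UTF-8 (invalid bytes become U+FFFD).
    public static string AsUtf8(byte[] bytes) =>
        Encoding.UTF8.GetString(bytes);

    // Decode the whole buffer as Windows-1250 (Central European).
    public static string AsWindows1250(byte[] bytes) =>
        Encoding.GetEncoding("windows-1250").GetString(bytes);

    // Decode the whole buffer as big-endian UTF-16 (two bytes per code unit).
    public static string AsUtf16BigEndian(byte[] bytes) =>
        Encoding.BigEndianUnicode.GetString(bytes);
}
```

For the ASCII bytes `0x68 0x69`, `AsUtf8` and `AsWindows1250` both return "hi", while `AsUtf16BigEndian` pairs the two bytes into the single code unit U+6869, which lies in the CJK range. That would account for the "Chinese letters" seen when plain text is decoded as BigEndianUnicode. The single byte `0xE9` is "é" in Windows-1250 but is not valid UTF-8 on its own.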

Decode like this:

    var encoding = Encoding.GetEncoding("Windows-1250");
    string message = encoding.GetString(data, 12 + capabilities_length, msg_tlv_length - 4);
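Since the question reports that messages from the same client decode fine as UTF-8, one possible way to handle both cases is to try a strict UTF-8 decode first and fall back to the single-byte code page only when the bytes are not valid UTF-8. This is a sketch under assumptions, not something ICQ-sharp provides: `IcqTextDecoder` and its `Decode` method are hypothetical names, Windows-1250 as the default fallback is a guess that depends on what the other client actually sends, and the provider registration assumes .NET Core or .NET 5+.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class IcqTextDecoder
{
    // Strict UTF-8: throws DecoderFallbackException on invalid bytes
    // instead of silently substituting U+FFFD.
    private static readonly Encoding StrictUtf8 =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    static IcqTextDecoder()
    {
        // Assumption: .NET Core / .NET 5+, where Windows-125x code pages
        // must be registered before Encoding.GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    public static string Decode(byte[] data, int offset, int count,
                                string fallbackName = "windows-1250")
    {
        try
        {
            // Pure ASCII and well-formed UTF-8 both succeed here.
            return StrictUtf8.GetString(data, offset, count);
        }
        catch (DecoderFallbackException)
        {
            // Not valid UTF-8: assume the legacy single-byte code page.
            return Encoding.GetEncoding(fallbackName).GetString(data, offset, count);
        }
    }
}
```

In `ParseMessage` this would be called as `IcqTextDecoder.Decode(data, 12 + capabilities_length, msg_tlv_length - 4)`. The check is a heuristic: a short legacy-encoded string can occasionally happen to be valid UTF-8, so if the protocol data states the character set explicitly, that information should take precedence.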
