Converting a hex value to a UTF-8 character
I'm using an IMAP class to read emails. When my mail body contains Ö, IMAP returns the hex value =C3=96. How do I convert it to a UTF-8 Ö?
I'm thinking of something like:
Encoding enc = Encoding.GetEncoding("UTF-8");
System.Byte[] ch = new System.Byte[1];
ch[0] = System.Convert.ToByte([hex value of Ö], 16);
var decodedItem = enc.GetString(ch);
Where the expected value of decodedItem is Ö. But I don't really know why Ö translates to =C3=96 in IMAP, and I can't pass that to ToByte() because =C3=96 isn't a true hex value.
I've also tried doing this:
Encoding enc = Encoding.GetEncoding("UTF-8");
System.Byte[] ch = new System.Byte[1];
ch[0] = 214;
var decodedItem = enc.GetString(ch);
But the value in decodedItem is =
In UTF-8, that symbol is actually two bytes (0xC3, 0x96), but you're only assigning one, and a different one at that (214 = 0xD6):
Encoding enc = Encoding.GetEncoding("UTF-8");
System.Byte[] ch = { 0xC3, 0x96 };
var decodedItem = enc.GetString(ch);
To clarify a bit further, 0xD6 (214) is the Unicode code point of Ö (U+00D6), not its UTF-8 encoding. You'd reach it by changing the call and values to match .NET's "Unicode" encoding, which is UTF-16 little-endian:
Encoding enc = Encoding.GetEncoding("Unicode");
System.Byte[] ch = { 0xD6, 0x00 };
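To make the difference between the two byte sequences concrete, here is a small sketch that goes the other way and prints the bytes each encoding produces for Ö. The class name is only for illustration; the calls are standard .NET ones.

```csharp
using System;
using System.Text;

class EncodingBytesDemo
{
    static void Main()
    {
        string s = "\u00D6"; // Ö, Unicode code point U+00D6

        // UTF-8 encodes U+00D6 as two bytes.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(s)));    // C3-96

        // .NET's "Unicode" encoding is UTF-16 little-endian: low byte first.
        Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetBytes(s))); // D6-00
    }
}
```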
From http://www.utf8-chartable.de/: U+00D6 Ö c3 96 LATIN CAPITAL LETTER O WITH DIAERESIS
This means you have to take away the '=' signs and then decode the remaining hex bytes (C3 96) as UTF-8.
I hope this helps.
Greetings, Alex
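A minimal sketch of that idea, assuming the input consists only of =XX escapes such as =C3=96 (real mail bodies also contain literal text and line breaks, which this does not handle). HexEscapes and DecodeEscapedUtf8 are hypothetical names, not part of any IMAP library.

```csharp
using System;
using System.Linq;
using System.Text;

static class HexEscapes
{
    // Hypothetical helper: only handles input made up entirely of =XX escapes.
    public static string DecodeEscapedUtf8(string escaped)
    {
        byte[] bytes = escaped
            .Split(new[] { '=' }, StringSplitOptions.RemoveEmptyEntries) // "=C3=96" -> "C3", "96"
            .Select(hex => Convert.ToByte(hex, 16))                      // parse each pair as base 16
            .ToArray();
        return Encoding.UTF8.GetString(bytes);                           // interpret the bytes as UTF-8
    }

    static void Main()
    {
        string decoded = DecodeEscapedUtf8("=C3=96");
        Console.WriteLine(decoded == "\u00D6"); // True
    }
}
```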
There's no Unicode in most of today's e-mails. In order to arrive at Unicode text, you have to do the following operations:
1. Determine the MIME structure of the message (see BODYSTRUCTURE in RFC 3501).
2. Look at the MIME headers (or the BODYSTRUCTURE response) to find out the Content-Transfer-Encoding of the part that you're looking at. The most common encodings are quoted-printable and base64. Look at RFC 2045, 2046, 2047 and 2048 for details.
3. Undo the Content-Transfer-Encoding so that you arrive at a bytestream which contains a sequence of bytes.
4. Convert those bytes to Unicode using the character encoding named in the part's Content-Type header, the charset parameter.
Alternatively, use a library which implements these functions in your favorite language/framework. There are plenty of them.
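The last two operations (undoing the Content-Transfer-Encoding, then applying the charset) could be sketched roughly as follows. This is an illustration under assumptions, not a complete MIME implementation: the transferEncoding and charset arguments are assumed to have been read from the part's headers or the BODYSTRUCTURE response already, MimeBodyDecoder and its methods are made-up names, and the quoted-printable decoder silently skips a malformed trailing "=". On .NET Core, charsets other than the built-in ones may also need the System.Text.Encoding.CodePages provider registered.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

static class MimeBodyDecoder
{
    // Undo the Content-Transfer-Encoding to get the raw bytes of the part.
    public static byte[] UndoTransferEncoding(string body, string transferEncoding)
    {
        switch (transferEncoding.ToLowerInvariant())
        {
            case "base64":
                return Convert.FromBase64String(body);
            case "quoted-printable":
                return DecodeQuotedPrintable(body);
            default:
                // Assumption: 7bit/8bit/binary bodies are passed through byte for byte.
                return Encoding.GetEncoding("ISO-8859-1").GetBytes(body);
        }
    }

    static byte[] DecodeQuotedPrintable(string text)
    {
        var bytes = new List<byte>();
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c != '=')
            {
                bytes.Add((byte)c); // literal characters are plain ASCII
            }
            else if (i + 1 < text.Length && (text[i + 1] == '\r' || text[i + 1] == '\n'))
            {
                // Soft line break: "=" followed by CRLF (or a bare LF) is dropped.
                i++;
                if (text[i] == '\r' && i + 1 < text.Length && text[i + 1] == '\n') i++;
            }
            else if (i + 2 < text.Length)
            {
                // "=XX": two hex digits giving one byte.
                bytes.Add(byte.Parse(text.Substring(i + 1, 2),
                                     NumberStyles.HexNumber, CultureInfo.InvariantCulture));
                i += 2;
            }
        }
        return bytes.ToArray();
    }

    // Interpret the raw bytes using the charset from the Content-Type header.
    public static string DecodePart(string body, string transferEncoding, string charset)
    {
        byte[] raw = UndoTransferEncoding(body, transferEncoding);
        return Encoding.GetEncoding(charset).GetString(raw);
    }

    static void Main()
    {
        Console.WriteLine(DecodePart("=C3=96", "quoted-printable", "utf-8") == "\u00D6"); // True
        Console.WriteLine(DecodePart("w5Y=", "base64", "utf-8") == "\u00D6");             // True
    }
}
```

For anything beyond experimentation, an existing MIME library is likely the safer route, as the answer suggests.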