Convert a hex value to a UTF-8 character

Question

I'm using an IMAP class to read emails. When my mail body contains Ö, IMAP returns the hex value =C3=96. How do I convert it to a UTF-8 Ö?

I'm thinking of something like:

Encoding enc = Encoding.GetEncoding("UTF-8");
System.Byte[] ch = new System.Byte[1];

ch[0] = System.Convert.ToByte([hex value of Ö], 16);
var decodedItem = enc.GetString(ch);

Where the expected value of decodedItem is Ö. But I don't really know why Ö translates to =C3=96 in IMAP, and I can't pass that to ToByte() because =C3=96 isn't a true hex value.

I've also tried doing this:

Encoding enc = Encoding.GetEncoding("UTF-8");
System.Byte[] ch = new System.Byte[1];

ch[0] = 214;
var decodedItem = enc.GetString(ch);

But the value in decodedItem is =.

Answer 1

In UTF-8 that symbol is actually two bytes (0xC3, 0x96), but you're only assigning one, and a different one at that (214 = 0xD6)...

Encoding enc = Encoding.GetEncoding("UTF-8");
System.Byte[] ch = { 0xC3, 0x96 };

var decodedItem = enc.GetString(ch);

To clarify a bit further, 0xD6 (214) is the Unicode code point of Ö (U+00D6), not its UTF-8 encoding. You can reach it by switching to the "Unicode" encoding, which in .NET means UTF-16 little-endian, and supplying the bytes in that form:

Encoding enc = Encoding.GetEncoding("Unicode");
System.Byte[] ch = { 0xD6, 0x00 };

var decodedItem = enc.GetString(ch);
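To see both byte sequences side by side, here is a minimal sketch that goes the other way, from the character to its bytes. The class name ByteDump and the method Hex are hypothetical helpers made up for this illustration; only the Encoding and BitConverter calls come from .NET.

```csharp
using System;
using System.Text;

public static class ByteDump
{
    // Encodes `text` with the given encoding and returns the bytes
    // as hyphen-separated hex pairs (the format BitConverter.ToString produces).
    public static string Hex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }
}
```

Calling `ByteDump.Hex("\u00D6", Encoding.UTF8)` gives `C3-96`, while `ByteDump.Hex("\u00D6", Encoding.Unicode)` gives `D6-00`, which matches the two byte arrays used above.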

Answer 2

From http://www.utf8-chartable.de/: U+00D6 Ö c3 96 LATIN CAPITAL LETTER O WITH DIAERESIS

This means you have to strip the '=' signs and then decode the remaining hex bytes (C3 96) as UTF-8.
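A rough sketch of that idea in C#, assuming the input consists only of =XX tokens such as =C3=96. The names HexEscapes and DecodeUtf8 are hypothetical. Real mail bodies mix literal text with escapes, so this is not a full quoted-printable decoder.

```csharp
using System;
using System.Text;

public static class HexEscapes
{
    // Decodes a string made up solely of "=XX" tokens (e.g. "=C3=96") as UTF-8.
    // Throws FormatException if a token is not valid hex.
    public static string DecodeUtf8(string escaped)
    {
        // Splitting on '=' leaves just the two-digit hex pairs: "C3", "96".
        string[] pairs = escaped.Split(new[] { '=' }, StringSplitOptions.RemoveEmptyEntries);
        byte[] bytes = new byte[pairs.Length];
        for (int i = 0; i < pairs.Length; i++)
        {
            bytes[i] = Convert.ToByte(pairs[i], 16);
        }
        // Both bytes must be decoded together; one byte alone is not a valid character.
        return Encoding.UTF8.GetString(bytes);
    }
}
```

With this helper, `HexEscapes.DecodeUtf8("=C3=96")` returns Ö.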

I hope this helps.

Greetings, Alex

Answer 3

Most of today's e-mails do not contain raw Unicode text. To arrive at Unicode text, you have to perform the following operations:

Find a textual part of the message. There could be many of them. See BODYSTRUCTURE in RFC 3501.
Inspect the MIME headers (or the BODYSTRUCTURE response) to find the Content-Transfer-Encoding of the part you're looking at. The most common encodings are quoted-printable and base64. See RFC 2045, 2046, 2047 and 2048 for details.
Undo the Content-Transfer-Encoding so that you arrive at a stream of raw bytes.
Look at the charset parameter of the Content-Type header.
Decode the stream of bytes using the codec/charset you found above.
Congratulations, you now have your Unicode string.
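The two decoding operations above (undoing the transfer encoding, then applying the charset) could be sketched roughly as follows. This assumes the Content-Transfer-Encoding and charset values have already been pulled out of the headers or the BODYSTRUCTURE response; MimePartDecoder, DecodeQuotedPrintable and DecodeText are hypothetical names, and only three transfer encodings are handled. On .NET Core, charsets other than the built-in ones (such as utf-8, us-ascii, iso-8859-1) may need an extra encoding provider to be registered.

```csharp
using System;
using System.IO;
using System.Text;

public static class MimePartDecoder
{
    // Undoes quoted-printable and returns the raw bytes.
    public static byte[] DecodeQuotedPrintable(string input)
    {
        var output = new MemoryStream();
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c != '=')
            {
                output.WriteByte((byte)c); // literal characters are expected to be ASCII
                continue;
            }
            // "=\r\n" or "=\n" is a soft line break and produces no output.
            if (i + 2 < input.Length && input[i + 1] == '\r' && input[i + 2] == '\n') { i += 2; continue; }
            if (i + 1 < input.Length && input[i + 1] == '\n') { i += 1; continue; }
            // Otherwise expect "=XX", where XX is a two-digit hex byte value.
            if (i + 2 >= input.Length)
                throw new FormatException("Truncated escape at index " + i);
            output.WriteByte(Convert.ToByte(input.Substring(i + 1, 2), 16));
            i += 2;
        }
        return output.ToArray();
    }

    // Undoes the transfer encoding, then decodes the bytes with the given charset.
    public static string DecodeText(string body, string transferEncoding, string charset)
    {
        byte[] bytes;
        switch (transferEncoding.ToLowerInvariant())
        {
            case "quoted-printable":
                bytes = DecodeQuotedPrintable(body);
                break;
            case "base64":
                bytes = Convert.FromBase64String(body);
                break;
            case "7bit":
                bytes = Encoding.ASCII.GetBytes(body);
                break;
            default:
                throw new NotSupportedException("Unhandled transfer encoding: " + transferEncoding);
        }
        return Encoding.GetEncoding(charset).GetString(bytes);
    }
}
```

For the message in the question, `MimePartDecoder.DecodeText("=C3=96", "quoted-printable", "utf-8")` returns Ö. The same character sent as base64 would arrive as `w5Y=` and decode to the same string.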

Alternatively, use a library that implements these functions in your favorite language/framework. There are plenty of them.

Answer 1: 2 votes, accepted, 2015-02-26 13:07:28
Answer 2: 1 vote, 2015-02-26 13:07:16
Answer 3: 1 vote, 2015-03-02 12:01:11