
Trouble with string encoding and emoji

I'm having trouble retrieving text messages from my server, specifically with their encoding. Messages can be in many languages (so they may contain accented characters, Japanese text, and so on) and can include emoji.

I retrieve the messages as JSON, along with some other info. Here is an example from the logs:

(lldb) po dataMessages
<__NSCFArray 0x14ecc7f0>(
{
    author = "User 1";
    text = "Hier, c'\U00c3\U00a9tait incroyable";
},
{
...
}
)

(lldb) po [[dataMessages objectAtIndex:0] objectForKey:@"text"]
Hier, c'était incroyable

I'm able to get the correct text with:

const char *c = [[[dataMessages objectAtIndex:indexPath.row] objectForKey:@"text"] cStringUsingEncoding:NSWindowsCP1252StringEncoding];
NSString *myMessage = [NSString stringWithCString:c encoding:NSUTF8StringEncoding];

However, if the message contains emoji, cStringUsingEncoding: returns NULL.
I have no control over the server, so I can't change the encoding before the messages are sent to me.

The problem is determining the encoding correctly. Emoji are not part of NSWindowsCP1252StringEncoding, so the conversion simply fails.
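To make the failure concrete, here is a minimal sketch that checks whether a string can be represented in Windows-1252 before attempting the conversion. The helper name FitsInCP1252 and the sample strings are illustrative assumptions, not part of the original code.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if every character of the string exists in Windows-1252.
static BOOL FitsInCP1252(NSString *string) {
    return [string canBeConvertedToEncoding:NSWindowsCP1252StringEncoding];
}

int main(void) {
    @autoreleasepool {
        NSString *accented = @"c'\u00e9tait";              // contains U+00E9 (e with acute accent)
        NSString *withEmoji = @"c'\u00e9tait \U0001F600";  // also contains U+1F600 (an emoji)

        // The accented text fits in Windows-1252; the emoji text does not.
        NSLog(@"accented fits: %d", FitsInCP1252(accented));
        NSLog(@"emoji fits: %d", FitsInCP1252(withEmoji));

        // cStringUsingEncoding: returns NULL when a lossless conversion is impossible.
        const char *c = [withEmoji cStringUsingEncoding:NSWindowsCP1252StringEncoding];
        NSLog(@"C string is NULL: %d", c == NULL);
    }
    return 0;
}
```

A check like this only explains why the NULL appears; it does not recover the emoji text.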

Moreover, you are going through an unnecessary stage. Do not make an intermediate C string! Just call NSString's initWithData:encoding:.

In your case, using NSWindowsCP1252StringEncoding was always a mistake; I'm surprised that it worked for any string. The bytes C3 A9 are the UTF-8 encoding of é. So just call initWithData:encoding: with NSUTF8StringEncoding from the get-go and all will be well.
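A minimal sketch of that approach, assuming you have access to the raw response bytes as an NSData. The helper name DecodeUTF8 and the sample bytes are made up for illustration; in a real app the data would come from your network response.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes raw bytes as UTF-8, or returns nil if they are not valid UTF-8.
static NSString *DecodeUTF8(NSData *data) {
    return [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
}

int main(void) {
    @autoreleasepool {
        // Sample bytes standing in for the server response:
        // "c'", then C3 A9 (U+00E9), then "tait ", then F0 9F 98 80 (emoji U+1F600).
        const unsigned char bytes[] = {
            'c', '\'', 0xC3, 0xA9, 't', 'a', 'i', 't', ' ',
            0xF0, 0x9F, 0x98, 0x80
        };
        NSData *responseData = [NSData dataWithBytes:bytes length:sizeof(bytes)];

        // Decode directly from the data; no intermediate C string is needed.
        NSString *message = DecodeUTF8(responseData);
        if (message == nil) {
            NSLog(@"The data is not valid UTF-8");
            return 1;
        }
        NSLog(@"%@", message);
    }
    return 0;
}
```

Because UTF-8 covers all of Unicode, accented characters and emoji go through the same code path.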
