
NSString UTF8String mangling Unicode characters

When I run [NSString UTF8String] on certain Unicode characters, the resulting const char* representation is mangled both in NSLog and on the device/simulator display. The NSString itself displays fine, but I need to convert the NSString to a C string to use it in CGContextShowTextAtPoint.

It's very easy to reproduce (see code below), but I've searched for similar questions without any luck. Must be something basic I'm missing.

const char *cStr = [@"章" UTF8String];
NSLog(@"%s", cStr); 

Thanks!

CGContextShowTextAtPoint is only for ASCII chars.

Check this SO question for answers.

When using the string format specifier (%s), you are not guaranteed that the characters of a C string will print correctly if they are not ASCII. A complex character like the one in your example is encoded in UTF-8 as a multi-byte sequence, where the lead byte signals how many continuation bytes follow. However, %s uses the system encoding to interpret the bytes of the C string you pass to the formatting function (in this case, NSLog), so those bytes are not read back as UTF-8. See Apple's documentation:

https://developer.apple.com/library/mac/documentation/cocoa/Conceptual/Strings/Articles/formatSpecifiers.html

%s Null-terminated array of 8-bit unsigned characters. %s interprets its input in the system encoding rather than, for example, UTF-8.
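To see why that produces garbage, here is a rough sketch in plain C of what a byte-at-a-time reader does to the UTF-8 bytes of "章" (E7 AB A0). The helper name misread_as_latin1 is made up for this illustration, and Latin-1 is only a stand-in for "some single-byte system encoding" because its mapping is trivial. The encoding NSLog actually applies may be a different one, so the exact junk characters you see can differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: treat every byte of `in` as one Latin-1 character
 * and re-encode the result as UTF-8 into `out`.
 * Latin-1 is an assumed stand-in for a single-byte system encoding.
 * `out` needs room for 2 * strlen(in) + 1 bytes.
 * Returns the number of bytes written, not counting the terminating NUL. */
size_t misread_as_latin1(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    assert(out_size >= 2 * strlen(in) + 1);

    for (const unsigned char *p = (const unsigned char *)in; *p; ++p) {
        if (*p < 0x80) {
            /* ASCII bytes mean the same thing in both encodings. */
            out[n++] = (char)*p;
        } else {
            /* Code points U+0080..U+00FF take two bytes in UTF-8. */
            out[n++] = (char)(0xC0 | (*p >> 6));
            out[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

Passing "\xE7\xAB\xA0" (the three bytes UTF8String returns for "章") yields three separate characters instead of one, which matches the kind of mangling described in the question. ASCII-only strings come out unchanged, which is why the problem only shows up for "certain" characters.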

As for CGContextShowTextAtPoint not working: that API supports only the MacRoman character set, which does not cover the entire Unicode character set.

You'll need to look into another API for showing Unicode characters. Core Text is probably where you'll want to start.
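If you still have code paths that hand a C string to a byte-oriented drawing call, one possible guard is to check first that the string is plain ASCII and route everything else to a Unicode-aware API. This is only a sketch; is_pure_ascii is a hypothetical helper name, not part of any Apple framework.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when every byte of the
 * NUL-terminated string `s` is 7-bit ASCII (below 0x80).
 * Any UTF-8 multi-byte sequence contains bytes >= 0x80,
 * so it makes this function return false. */
bool is_pure_ascii(const char *s)
{
    assert(s != NULL);

    for (const unsigned char *p = (const unsigned char *)s; *p; ++p) {
        if (*p >= 0x80) {
            return false;
        }
    }
    return true;
}
```

With the string from the question, is_pure_ascii([@"章" UTF8String]) would be false, so that text should go through the Unicode-capable path rather than the MacRoman-only call.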

I've never noticed this issue before, but some quick experimentation shows that using printf instead of NSLog will cause the correct Unicode character to show up.

Try:

printf("%s", cStr);

This gives me the desired output ("章") both in the Xcode console and in Terminal. As nob1984 stated in his answer, the interpretation of the character data is up to the callee.
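If you want to convince yourself that UTF8String itself is not the culprit, you can dump the raw bytes instead of printing them as text. A minimal sketch in plain C, where hex_dump is a hypothetical helper written for this answer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the bytes of `s` as space-separated
 * uppercase hex pairs into `out` and returns `out`.
 * `out_size` must be at least 3 * strlen(s) + 1. */
char *hex_dump(const char *s, char *out, size_t out_size)
{
    size_t len = strlen(s);

    assert(out_size >= 3 * len + 1);

    out[0] = '\0';
    for (size_t i = 0; i < len; ++i) {
        /* Two hex digits per byte, plus a separator except after the last. */
        snprintf(out + 3 * i, out_size - 3 * i, "%02X%s",
                 (unsigned)(unsigned char)s[i],
                 i + 1 < len ? " " : "");
    }
    return out;
}
```

For the cStr from the question this should give E7 AB A0, which is the correct UTF-8 encoding of U+7AE0. The bytes are fine; what differs between printf and NSLog is how the receiving side interprets them.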
