NSLog() vs printf() when printing C strings (UTF-8)

Question

I have noticed that if I print a byte array holding the UTF-8 representation of a string with the format specifier "%s", printf() gets it right but NSLog() garbles it (i.e., each byte is printed as-is, so for example "¥" is printed as the two characters "¬•"). This is curious, because I always thought that NSLog() is just printf(), plus:

The first parameter (the 'format') is an Objective-C string, not a C string (hence the "@").
A timestamp and the app name are prepended.
A newline is automatically added at the end.
The ability to print Objective-C objects (using the format "%@").

My code:

NSString* string; 

// (...fill string with unicode string...)

const char* stringBytes = [string cStringUsingEncoding:NSUTF8Encoding];

NSUInteger stringByteLength = [string lengthOfBytesUsingEncoding:NSUTF8Encoding];
stringByteLength += 1; // add room for '\0' terminator

char* buffer = calloc(stringByteLength, sizeof(char)); // calloc takes (count, size)

memcpy(buffer, stringBytes, stringByteLength);

NSLog(@"Buffer after copy: %s", buffer);
// (renders ascii, no matter what)

printf("Buffer after copy: %s\n", buffer);
// (renders correctly, e.g. japanese text)

Somehow, it looks as if printf() is "smarter" than NSLog(). Does anyone know the underlying cause, and whether this behavior is documented anywhere? (I couldn't find anything.)

Answer 1

NSLog() and stringWithFormat: seem to expect the string for %s in the "system encoding" (for example "Mac Roman" on my computer):

NSString *string = @"¥";
NSStringEncoding enc = CFStringConvertEncodingToNSStringEncoding(CFStringGetSystemEncoding());
const char* stringBytes = [string cStringUsingEncoding:enc];
NSString *log = [NSString stringWithFormat:@"%s", stringBytes];
NSLog(@"%@", log);

// Output: ¥

Of course, this will fail if some characters are not representable in the system encoding. I could not find official documentation for this behavior, but one can see that using %s in stringWithFormat: or NSLog() does not work reliably with arbitrary UTF-8 strings.

If you want to check the contents of a char buffer containing a UTF-8 string, then this works with arbitrary characters (using the boxed expression syntax to create an NSString from a UTF-8 string):

NSLog(@"%@", @(utf8Buffer));

Answer 1 (2, accepted), 2014-05-15 08:23:57
