String encoding of a Scandinavian letter from URL to UTF-8 to const char on iPhone

NSString *theString = @"a %C3%B8 b";

NSLog(@"%@", theString);

NSString *utf8string = [theString stringByReplacingPercentEscapesUsingEncoding: NSUTF8StringEncoding];

NSLog(@"%@", utf8string);

const char *theChar = [utf8string UTF8String];

NSLog(@"%s", theChar);

This logs the following:

'a %C3%B8 b'

'a ø b'

'a √∏ b'

The problem is that I want theChar to be 'a ø b'. Any help on how to achieve that would be greatly appreciated.

I don't think you can hold it in a single char. char is an eight-bit type, so every value is in the range 0-255. In UTF-8, ø is not encoded as a single value in that range; it takes two bytes (0xC3 0xB8, the same pair that appears as %C3%B8 in your URL).
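To make the byte layout concrete, here is a minimal C sketch (it should also compile unchanged in an Objective-C file). The names kSample and hex_dump are made up for this illustration and are not part of any Apple API; the string is written with explicit escapes so the result does not depend on the source file's encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* "a ø b" spelled out as UTF-8 bytes: ø (U+00F8) is the pair 0xC3 0xB8. */
static const char kSample[] = "a \xC3\xB8 b";

/* Hypothetical helper: writes the bytes of a NUL-terminated string into out
   as space-separated uppercase hex pairs. Returns the number of bytes in s,
   or 0 if out is too small to hold the dump. */
size_t hex_dump(const char *s, char *out, size_t out_size)
{
    assert(s != NULL && out != NULL);

    size_t n = strlen(s);
    if (out_size < 3 * n + 1) {
        return 0;
    }

    char *p = out;
    for (size_t i = 0; i < n; i++) {
        /* Cast to unsigned char so bytes >= 0x80 are not sign-extended. */
        p += sprintf(p, i ? " %02X" : "%02X", (unsigned char)s[i]);
    }
    *p = '\0';
    return n;
}
```

Calling hex_dump(kSample, buf, sizeof buf) with a 64-byte buffer should report 6 bytes for the 5 visible characters, with C3 B8 sitting where the ø is.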

You might want to look at the unichar type, which is a 16-bit type. It can hold ø as one item, and you can use getCharacters:range: to get the characters out of the NSString.
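As a rough picture of what the 16-bit representation looks like, the following C sketch decodes UTF-8 into 16-bit units by hand. The function name utf8_to_utf16_bmp is hypothetical, and the decoder is deliberately limited: it only handles 1- to 3-byte sequences (the Basic Multilingual Plane) and does not reject overlong encodings or surrogate values, so it is an illustration rather than a replacement for getCharacters:range:.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decodes NUL-terminated UTF-8 into 16-bit units.
   Returns the number of units written, or (size_t)-1 if the input is
   malformed, needs more than 3 bytes per character, or dst is too small. */
size_t utf8_to_utf16_bmp(const char *src, uint16_t *dst, size_t dst_cap)
{
    assert(src != NULL && dst != NULL);

    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    while (*p) {
        uint16_t unit;

        if (p[0] < 0x80) {
            /* Single byte: plain ASCII. */
            unit = p[0];
            p += 1;
        } else if ((p[0] & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
            /* Two bytes: 110xxxxx 10xxxxxx, e.g. 0xC3 0xB8 for ø. */
            unit = (uint16_t)(((p[0] & 0x1F) << 6) | (p[1] & 0x3F));
            p += 2;
        } else if ((p[0] & 0xF0) == 0xE0 && (p[1] & 0xC0) == 0x80 &&
                   (p[2] & 0xC0) == 0x80) {
            /* Three bytes: 1110xxxx 10xxxxxx 10xxxxxx. */
            unit = (uint16_t)(((p[0] & 0x0F) << 12) |
                              ((p[1] & 0x3F) << 6) |
                              (p[2] & 0x3F));
            p += 3;
        } else {
            return (size_t)-1;
        }

        if (n == dst_cap) {
            return (size_t)-1;
        }
        dst[n++] = unit;
    }
    return n;
}
```

For the bytes of "a ø b" this yields five units, and the third one is 0x00F8, i.e. ø as a single item.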

From String Format Specifiers in the String Programming Guide:

%s: Null-terminated array of 8-bit unsigned characters. %s interprets its input in the system encoding rather than, for example, UTF-8.

So NSLog(@"%s", theChar) creates and displays an NSString object with the wrong encoding, while theChar itself contains the correct string data.

NSLog(@"%@", [NSString stringWithUTF8String:theChar]);

This gives the correct output: 'a ø b'.
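The '√∏' in the question's log is consistent with the two bytes 0xC3 0xB8 each being read as one Mac OS Roman character. That the system encoding here is Mac OS Roman is my assumption, not something the guide states. The C sketch below reproduces the effect with a deliberately partial lookup table that covers only those two bytes; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: partial Mac OS Roman table, only the two bytes needed here.
   Returns 0 for any non-ASCII byte that is not covered. */
static uint32_t macroman_to_codepoint(unsigned char b)
{
    if (b < 0x80) {
        return b;                  /* ASCII range is shared */
    }
    switch (b) {
    case 0xC3: return 0x221A;      /* SQUARE ROOT */
    case 0xB8: return 0x220F;      /* N-ARY PRODUCT */
    default:   return 0;
    }
}

/* Writes code point cp (at most U+FFFF) as UTF-8; returns bytes written. */
static size_t put_utf8(uint32_t cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (char)(0xE0 | (cp >> 12));
    out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Re-reads bytes one at a time as Mac OS Roman and writes the result as
   UTF-8. Returns 0 on success, -1 on an uncovered byte or a small buffer. */
int misread_as_macroman(const char *bytes, char *out, size_t out_size)
{
    assert(bytes != NULL && out != NULL);

    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)bytes; *p; p++) {
        uint32_t cp = macroman_to_codepoint(*p);
        if (cp == 0 || n + 3 >= out_size) {
            return -1;
        }
        n += put_utf8(cp, out + n);
    }
    if (n >= out_size) {
        return -1;
    }
    out[n] = '\0';
    return 0;
}
```

Feeding it the UTF-8 bytes of "a ø b" produces the UTF-8 for "a √∏ b": the data was fine all along, only the interpretation changed.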

I'd like to add that your theChar does contain the UTF-8 byte sequence of your desired string. The problem lies with NSLog(@"%s"), which can't show the string correctly in the log file and/or the console.

So, if you want to pass the UTF-8 byte sequence in a char* to some other library, what you did is perfectly correct.
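If the receiving library is plain C, the percent-decoding step can also be done directly on bytes. This is a minimal sketch under my own assumptions, not what NSString does internally: the name percent_decode is hypothetical, malformed escapes are copied through unchanged, and a decoded %00 would end up as an embedded NUL.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value of a hex digit, or -1 if c is not one. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decodes %XX escapes from src into dst and
   NUL-terminates the result. Returns the decoded length in bytes,
   or (size_t)-1 if dst is too small. */
size_t percent_decode(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);

    size_t n = 0;
    while (*src) {
        unsigned char byte = (unsigned char)*src;
        size_t step = 1;

        if (src[0] == '%') {
            /* Only look at src[2] once src[1] is known to be a hex digit,
               so the terminating NUL is never read past. */
            int hi = hex_value((unsigned char)src[1]);
            int lo = hi >= 0 ? hex_value((unsigned char)src[2]) : -1;
            if (lo >= 0) {
                byte = (unsigned char)(hi * 16 + lo);
                step = 3;
            }
        }

        if (n + 1 >= dst_size) {
            return (size_t)-1;     /* keep room for the terminator */
        }
        dst[n++] = (char)byte;
        src += step;
    }

    if (dst_size == 0) {
        return (size_t)-1;
    }
    dst[n] = '\0';
    return n;
}
```

Decoding "a %C3%B8 b" this way gives 6 bytes with 0xC3 0xB8 in the middle, the same sequence that [utf8string UTF8String] hands back.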
