How do I properly encode Unicode characters in my NSString?

Problem Statement

I create a number of strings, concatenate them into CSV format, and then email the resulting string as an attachment.

When these strings contain only ASCII characters, the CSV file is built and emailed properly. When I include non-ASCII characters, the resulting string becomes malformed and the CSV file is not created properly. (The email view shows an attachment, but it is not sent.)

For instance, this works:

uncle bill's house of pancakes

But this doesn't (note the curly apostrophe):

uncle bill’s house of pancakes

Question

How do I create and encode the final string so that all valid Unicode characters are included and the resulting string is well formed?

Notes

  • The strings are created via a UITextField, written to a Core Data store, and then read back from it.

  • This suggests that the problem lies in the initial creation and encoding of the string: "NSString unicode encoding problem"

  • I don't want to have to do this: "remove non ASCII characters from NSString in objective-c"

  • The strings are written to and read from the data store without problems. They display properly (individually) in the app's table views. The problem only manifests when the strings are concatenated for the email attachment.

String Processing Code

I concatenate my strings together like this:

[reportString appendFormat:@"%@,", category];
[reportString appendFormat:@"%@,", client];
[reportString appendFormat:@"%@\n", detail];
etc.

Replacing curly quotes with plain quotes makes it work, but I don't want to do it this way:

- (NSMutableString *)cleanString:(NSString *)activity {
    NSString *temp1 = [activity stringByReplacingOccurrencesOfString:@"’" withString:@"'"];
    NSString *temp2 = [temp1 stringByReplacingOccurrencesOfString:@"‘" withString:@"'"];
    NSString *temp3 = [temp2 stringByReplacingOccurrencesOfString:@"”" withString:@"\""];
    NSString *temp4 = [temp3 stringByReplacingOccurrencesOfString:@"“" withString:@"\""];
    return [NSMutableString stringWithString:temp4];
}

Edit: The email is sent like this:

    NSString *attachment = [self formatReportCSV];
    [picker addAttachmentData:[attachment dataUsingEncoding:NSStringEncodingConversionAllowLossy] mimeType:nil fileName:@"MyCSVFile.csv"];

where formatReportCSV is the method that concatenates and returns the CSV string.

You seem to be running into a string encoding issue. Without seeing what your Core Data model looks like, I'd assume the problem boils down to the one reproduced by the code below.

NSString *string1 = @"Uncle bill’s house of pancakes.";
NSString *string2 = @" Appended with some garbage's stuff.";
NSMutableString *mutableString = [NSMutableString stringWithString: string1];
[mutableString appendString: string2];
NSLog(@"We got: %@", mutableString);
// We got: Uncle bill’s house of pancakes. Appended with some garbage's stuff.

NSData *storedVersion = [mutableString dataUsingEncoding: NSStringEncodingConversionAllowLossy];
NSString *restoredString = [[NSString alloc] initWithData: storedVersion encoding: NSStringEncodingConversionAllowLossy];
NSLog(@"Restored string with NSStringEncodingConversionAllowLossy: %@", restoredString);
// Restored string with NSStringEncodingConversionAllowLossy: 

storedVersion = [mutableString dataUsingEncoding: NSUTF8StringEncoding];
restoredString = [[NSString alloc] initWithData: storedVersion encoding: NSUTF8StringEncoding];
NSLog(@"Restored string with UTF8: %@", restoredString);
// Restored string with UTF8: Uncle bill’s house of pancakes. Appended with some garbage's stuff.

Note how the first conversion couldn't handle the non-ASCII character. NSStringEncodingConversionAllowLossy is a conversion option rather than a string encoding; its raw value, 1, is the same as NSASCIIStringEncoding, so the string is effectively encoded as ASCII. (ASCII can cope if you use dataUsingEncoding:allowLossyConversion: with the second parameter set to YES, at the cost of dropping or substituting the characters it cannot represent.)
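To make the difference concrete, here is a minimal sketch that can be built as a Foundation command-line tool. The variable names are only illustrative, and the exact substitution chosen by the lossy conversion is not asserted here because it is up to the framework.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // \u2019 is the curly apostrophe from the question.
        NSString *name = @"uncle bill\u2019s house of pancakes";

        // The option flag and the ASCII encoding share the raw value 1,
        // which is why passing the flag as an encoding means "ASCII".
        BOOL sameValue = ((NSStringEncoding)NSStringEncodingConversionAllowLossy
                          == NSASCIIStringEncoding);
        NSLog(@"flag value equals NSASCIIStringEncoding: %@", sameValue ? @"YES" : @"NO");

        // Strict ASCII: documented to return nil when the string cannot be
        // converted without losing information.
        NSData *strict = [name dataUsingEncoding:NSASCIIStringEncoding];
        NSLog(@"strict ASCII data: %@", strict);

        // Lossy ASCII: conversion succeeds, but the curly apostrophe is
        // replaced or removed.
        NSData *lossy = [name dataUsingEncoding:NSASCIIStringEncoding
                           allowLossyConversion:YES];
        NSString *lossyString = [[NSString alloc] initWithData:lossy
                                                      encoding:NSASCIIStringEncoding];
        NSLog(@"lossy ASCII string: %@", lossyString);

        // UTF-8: round-trips the original string unchanged.
        NSData *utf8 = [name dataUsingEncoding:NSUTF8StringEncoding];
        NSString *utf8String = [[NSString alloc] initWithData:utf8
                                                     encoding:NSUTF8StringEncoding];
        NSLog(@"UTF-8 round trip intact: %@",
              [utf8String isEqualToString:name] ? @"YES" : @"NO");
    }
    return 0;
}
```

Checking the result of the conversion for nil before attaching it is a cheap way to catch this kind of problem early.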

This code should fix the issue:

NSString *attachment = [self formatReportCSV];
[picker addAttachmentData:[attachment dataUsingEncoding: NSUTF8StringEncoding] mimeType:nil fileName:@"MyCSVFile.csv"];

Note: UTF-8 can represent every Unicode character, including Japanese text, so it should be sufficient here. You may only need one of the UTF-16 string encodings if the program that opens the CSV file expects UTF-16.
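As a quick check of how the two encodings treat Japanese text, the following sketch round-trips a sample CSV row through NSUTF8StringEncoding and NSUTF16StringEncoding. The row contents are made up for illustration; this is a Foundation command-line sketch, not part of the original app.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Hypothetical CSV row mixing Japanese text and a curly apostrophe.
        NSString *row = @"パンケーキ,東京,uncle bill\u2019s\n";

        // Encode to UTF-8 and decode again.
        NSData *utf8 = [row dataUsingEncoding:NSUTF8StringEncoding];
        NSString *fromUTF8 = [[NSString alloc] initWithData:utf8
                                                   encoding:NSUTF8StringEncoding];
        NSLog(@"UTF-8 round trip intact: %@",
              [fromUTF8 isEqualToString:row] ? @"YES" : @"NO");

        // Encode to UTF-16 and decode again.
        NSData *utf16 = [row dataUsingEncoding:NSUTF16StringEncoding];
        NSString *fromUTF16 = [[NSString alloc] initWithData:utf16
                                                    encoding:NSUTF16StringEncoding];
        NSLog(@"UTF-16 round trip intact: %@",
              [fromUTF16 isEqualToString:row] ? @"YES" : @"NO");

        // The byte counts differ, which matters only for attachment size.
        NSLog(@"UTF-8 bytes: %lu, UTF-16 bytes: %lu",
              (unsigned long)utf8.length, (unsigned long)utf16.length);
    }
    return 0;
}
```

Both conversions preserve the text, so the choice between them comes down to what the receiving application reads correctly.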
