How to handle Windows-1252 characters in iOS?

I've done my research but couldn't find a clear answer.

My question is the following: I have a MySQL database which I convert to a SQLite database with the help of a script. In the original database (and also in the SQLite copy) I've found some entries containing the following characters: ‘, ’, ë, ñ, ... (Windows-1252?) Some entries also contain HTML code.

I ran a test in PHP against the original MySQL database, and the characters displayed correctly as long as I set the content type's charset to UTF-8. Otherwise I got the same odd characters: ‘, ’, ë, ñ.

I've tried the following in iOS:

[[NSString alloc] initWithCString:(const char *) sqlite3_column_text(rs.statement.statement, 4) encoding:NSStringEncodingConversionExternalRepresentation];

[NSString stringWithUTF8String:[[[rs stringForColumn:@"tekst"] stringByDecodingHTMLEntities] cStringUsingEncoding:NSStringEncodingConversionExternalRepresentation]];

I tried each of the following values for the encoding argument:

  • NSStringEncodingConversionExternalRepresentation
  • NSUTF8StringEncoding
  • NSISO2022JPStringEncoding
  • NSStringEncodingConversionAllowLossy
  • NSWindowsCP1252StringEncoding
  • ...

Then I found this: MWFeedParser NSString+HTML

With these classes I was able to decode the HTML entities, for example &euml; to ë. The other characters, on the other hand, still didn't come out right.

So, how can I convert/encode/decode these characters to display normally?

Did you try to fix the HTML encoding first? I would imagine it was applied last, so it should be undone first. If you can get a few sample strings, write a quick-and-dirty test app and try to figure out exactly which encodings were applied, and in what order.
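One way to see how a sample string was encoded is to look at its raw bytes. The following is a minimal sketch in plain C (which also compiles inside an Objective-C project); hex_dump is a hypothetical helper name, not a system API. For example, ë stored as the single byte EB points to Windows-1252 or Latin-1, C3 AB is plain UTF-8, and C3 83 C2 AB suggests UTF-8 text that was misread as Windows-1252 and then encoded to UTF-8 a second time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the bytes of a NUL-terminated string into `out` as
 * space-separated uppercase hex, e.g. "C3 AB".
 * hex_dump is a hypothetical helper for inspection only.
 * Output is silently truncated if `out` is too small. */
void hex_dump(const char *text, char *out, size_t out_size)
{
    assert(text != NULL && out != NULL && out_size > 0);

    size_t used = 0;
    out[0] = '\0';

    for (const unsigned char *p = (const unsigned char *)text; *p != '\0'; ++p) {
        /* Each byte needs at most a separator, two hex digits and the NUL. */
        if (used + 4 > out_size) {
            break;
        }
        used += (size_t)snprintf(out + used, out_size - used,
                                 used == 0 ? "%02X" : " %02X", *p);
    }
}
```

In the app, the pointer returned by sqlite3_column_text (cast to const char *, as in the question) could be passed as text and the result logged, so the bytes are inspected before any NSString conversion touches them.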

1) You can try other encodings from this list:

NSISOLatin1StringEncoding = 5,
NSISOLatin2StringEncoding = 9,
NSWindowsCP1251StringEncoding = 11,
NSWindowsCP1252StringEncoding = 12,
NSWindowsCP1253StringEncoding = 13,
NSWindowsCP1254StringEncoding = 14,
NSWindowsCP1250StringEncoding = 15,

2) If Apple does not provide the proper encoding, but it is a known encoding, then you can use iconv(), which is available on both the Mac and iOS. It can convert virtually any string encoding to virtually any other. It's a bit complicated to use, but you will find lots of examples on the net.
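As a rough illustration of option 2, here is a minimal sketch of an iconv() call that converts Windows-1252 bytes to UTF-8. The helper name cp1252_to_utf8 is made up for this example, and the encoding name "CP1252" is an assumption about what the installed iconv accepts (some builds also accept "WINDOWS-1252"). Whether the data really is Windows-1252 has to be verified first.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Convert a NUL-terminated Windows-1252 string to a newly allocated
 * UTF-8 string. Returns NULL on failure; the caller frees the result.
 * cp1252_to_utf8 is a hypothetical helper name, not a system API. */
char *cp1252_to_utf8(const char *input)
{
    assert(input != NULL);

    /* iconv_open takes the target encoding first, then the source. */
    iconv_t cd = iconv_open("UTF-8", "CP1252");
    if (cd == (iconv_t)-1) {
        return NULL;
    }

    size_t in_left = strlen(input);
    /* A Windows-1252 byte expands to at most 3 bytes of UTF-8. */
    size_t out_size = in_left * 3 + 1;
    size_t out_left = out_size - 1;
    char *output = malloc(out_size);
    if (output == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)input;  /* iconv's prototype is not const-correct */
    char *out_ptr = output;
    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);

    if (rc == (size_t)-1) {
        /* For example, a byte that is undefined in Windows-1252. */
        free(output);
        return NULL;
    }

    *out_ptr = '\0';
    return output;
}
```

On iOS the returned buffer could then be handed to [NSString stringWithUTF8String:] and freed afterwards. Swapping the two arguments of iconv_open reverses the direction of the conversion.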
