
How to convert wchar_t and std::wstring in C++ to something better supported?

I am a Java developer, and I have run into problems using the Android NDK to compile C++ classes that use wchar_t, std::wstring, etc. After checking whether anything supports these, my findings so far indicate that nothing on the NDK fully supports them. This means I need to change them in the source. How exactly can this be done? Thanks

The best approach is to re-write as much as possible in Java :)

But wchar_t and friends are basically just "zero-terminated arrays that have 16-bit instead of 8-bit characters" (16-bit on Windows, at least; on Linux and Android wchar_t is 32 bits). Microsoft libraries just punt by having parallel versions of strcpy()/wcscpy(), strlen()/wcslen(), etc. It should be fairly straightforward to identify where wchar_t is being used, and implement those few simple functions you might need yourself, shouldn't it?
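As a rough sketch of what "implement those few simple functions yourself" could look like, here are two helpers that work on 16-bit code units. The names u16_strlen and u16_strcpy are made up for illustration, and using char16_t (C++11) instead of wchar_t is an assumption on my part, chosen because its width does not vary by platform.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper names. char16_t is used because it is a 16-bit
// code unit everywhere, unlike wchar_t, whose width depends on the platform.

// Counts the code units before the terminating zero (analogous to wcslen).
std::size_t u16_strlen(const char16_t* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != u'\0') {
        ++n;
    }
    return n;
}

// Copies src, including its terminator, into dst (analogous to wcscpy).
// The caller must make sure dst has room for u16_strlen(src) + 1 units.
char16_t* u16_strcpy(char16_t* dst, const char16_t* src) {
    assert(dst != nullptr && src != nullptr);
    char16_t* out = dst;
    while ((*out++ = *src++) != u'\0') {
        // The copy happens in the loop condition.
    }
    return dst;
}
```

With these, u16_strlen(u"hello") yields 5, and existing wcslen/wcscpy call sites could be pointed at the replacements once the string type has been switched.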

Ok, besides flagging the duplicate, I found this interesting article:

TL;DR: We elected to extract the core wide/narrow conversion routines from the Android implementation of JNI in the Android open source project so the conversion runs entirely in native code.

Wide and Narrow Character String Conversions

This is quite a complex issue when porting an application because of the multiple methods and standards in existence. Windows Mobile (Windows CE) standardized on UTF-16, with two bytes per character unit, and with rare exceptions the ANSI (one byte per character unit) native APIs were eliminated. The C# language and .NET Compact Framework also utilize UTF-16.

The Linux and Android native APIs rely on null-terminated strings with one byte per character unit. A wide C++ character (wchar_t) on Linux is 4 bytes, as opposed to 2 bytes per character unit on a Microsoft platform. One effect is to double the storage size of all wide character strings, including string literals prefixed with L.
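The size difference is easy to check at compile time. The following is only a small sketch; wide_bytes is a hypothetical helper, and the value of sizeof(wchar_t) depends on the platform and compiler flags in use.

```cpp
#include <cstddef>

// Bytes occupied by a wide string of `length` characters plus its terminator.
// Hypothetical helper: the result depends on the platform's wchar_t width
// (typically 4 bytes on Linux/Android, 2 bytes on Windows).
constexpr std::size_t wide_bytes(std::size_t length) {
    return (length + 1) * sizeof(wchar_t);
}

// L"abc" is an array of four wchar_t: three characters and the terminator.
static_assert(sizeof(L"abc") == wide_bytes(3),
              "the literal's size follows the width of wchar_t");
```

On a platform where wchar_t is 4 bytes, wide_bytes(3) is 16; where it is 2 bytes, the same literal takes 8.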

One possibility is to translate UTF-16, including surrogate pairs, to UTF-8 multi-byte strings, which can require one to four bytes for each character and can contain embedded zero bytes (standard UTF-8 encodes U+0000 as a single zero byte). The Java Native Interface (JNI) provides routines to translate Java UTF-16 into "Modified" UTF-8. The modifications result in a narrow character string containing no embedded zeros, only the zero at the end of the string, because U+0000 is encoded as two non-zero bytes. Another modification is to translate a four-byte UTF-16 surrogate pair into two UTF-8 sequences, each three bytes long, instead of a single four-byte UTF-8 sequence.
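To make the two modifications concrete, here is a minimal sketch of a UTF-16 to Modified UTF-8 encoder. It is not the JNI or Android source; the function name utf16_to_modified_utf8 is hypothetical, and the use of char16_t and std::string is an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encodes `len` UTF-16 code units as Modified UTF-8 (hypothetical helper).
// Each code unit is encoded on its own, so a surrogate pair becomes two
// three-byte sequences, and U+0000 becomes the two bytes 0xC0 0x80.
std::string utf16_to_modified_utf8(const char16_t* src, std::size_t len) {
    assert(src != nullptr || len == 0);
    std::string out;
    for (std::size_t i = 0; i < len; ++i) {
        const unsigned c = src[i];
        if (c >= 0x0001 && c <= 0x007F) {
            // Plain ASCII (except NUL): one byte.
            out.push_back(static_cast<char>(c));
        } else if (c <= 0x07FF) {
            // Two bytes; this branch also handles U+0000.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        } else {
            // Three bytes; surrogate halves land here individually.
            out.push_back(static_cast<char>(0xE0 | (c >> 12)));
            out.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

For example, the surrogate pair 0xD83D 0xDE00 (U+1F600) comes out as the six bytes ED A0 BD ED B8 80, where standard UTF-8 would produce the four bytes F0 9F 98 80.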

The end result of using JNI routines to translate between wide and narrow strings is that the wide UTF-16 string format is compatible both with Java and Windows Mobile (CE) and the narrow Modified UTF-8 string is compatible with Android / Linux OS API and C run time library.

The Android C run time library (Bionic) contains a wchar.h that declares functions such as wcslen, wcscpy, etc., but as noted in the comments in the header, no actual wide char functions are implemented in the Android C run time library. We resolve this by using the GNU C++ compiler option "-fshort-wchar", which forces the compiler to treat wide characters as two bytes instead of four bytes. This makes the L"string" literal two bytes per character and compatible with UTF-16. We have extracted the actual wide character run time library from the Wine open source project.

It is possible to use JNI as delivered in Android to translate between native C++ wide and narrow strings. This involves a round trip through the Java environment and thus is not very efficient. We elected to extract the core wide/narrow conversion routines from the Android implementation of JNI in the Android open source project so the conversion runs entirely in native code.
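For a sense of what a purely native conversion involves, here is a minimal sketch of decoding Modified UTF-8 back into UTF-16 code units without touching the Java environment. This is not the code extracted from the Android open source project; the function name modified_utf8_to_utf16 is hypothetical, and the sketch assumes well-formed input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decodes Modified UTF-8 into UTF-16 code units (hypothetical helper).
// Assumes well-formed input: only 1-, 2-, and 3-byte sequences occur,
// so each sequence maps to exactly one UTF-16 code unit.
std::u16string modified_utf8_to_utf16(const std::string& src) {
    std::u16string out;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned b0 = static_cast<unsigned char>(src[i]);
        if (b0 < 0x80) {
            // One byte: ASCII.
            out.push_back(static_cast<char16_t>(b0));
            i += 1;
        } else if ((b0 & 0xE0) == 0xC0) {
            // Two bytes, including 0xC0 0x80 for U+0000.
            assert(i + 1 < src.size());
            const unsigned b1 = static_cast<unsigned char>(src[i + 1]);
            out.push_back(static_cast<char16_t>(((b0 & 0x1F) << 6) | (b1 & 0x3F)));
            i += 2;
        } else {
            // Three bytes, including individually encoded surrogate halves.
            assert((b0 & 0xF0) == 0xE0 && i + 2 < src.size());
            const unsigned b1 = static_cast<unsigned char>(src[i + 1]);
            const unsigned b2 = static_cast<unsigned char>(src[i + 2]);
            out.push_back(static_cast<char16_t>(
                ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)));
            i += 3;
        }
    }
    return out;
}
```

Production code would need to validate continuation bytes and reject malformed sequences instead of relying on assertions.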
