
How to Convert wchar wstring in C++ to something better supported?

Original post: 2011-11-25 07:08:20 | Tags: android, c++, android-ndk

I am a Java developer, and I have run into compile problems with the Android NDK when building C++ classes that use wchar, wstring, etc. After checking whether anything supports these, my findings so far indicate that nothing fully supports them on the NDK. This means I need to change them in the source. How exactly can this be done? Thanks

2 answers

The best approach is to re-write as much as possible in Java :)

But wchar and friends are basically just "zero-terminated arrays that have 16-bit (on Windows) instead of 8-bit characters". Microsoft libraries just punt by having parallel versions of strcpy()/wcscpy(), strlen()/wcslen(), etc. It should be fairly straightforward to identify where wchars are being used and to implement the few simple functions you might need yourself, shouldn't it?
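As a rough sketch of what "implement those few simple functions yourself" could look like, here are hand-rolled 16-bit equivalents of strlen(), strcpy() and strcmp(). The names wide_char, wide_strlen, wide_strcpy and wide_strcmp are made up for this example, and it assumes a toolchain with C++11 char16_t; on an older compiler a 16-bit unsigned integer typedef would play the same role.

```cpp
#include <cstddef>

// Hypothetical 16-bit character type used only in this sketch (needs C++11).
typedef char16_t wide_char;

// Counts code units before the terminating zero, like strlen() for 16-bit units.
std::size_t wide_strlen(const wide_char* s) {
    const wide_char* p = s;
    while (*p != 0) {
        ++p;
    }
    return static_cast<std::size_t>(p - s);
}

// Copies src, including its terminating zero, into dst.
// The caller must make sure dst is large enough.
wide_char* wide_strcpy(wide_char* dst, const wide_char* src) {
    wide_char* out = dst;
    while ((*out++ = *src++) != 0) {
    }
    return dst;
}

// Compares like strcmp(): negative, zero, or positive result.
int wide_strcmp(const wide_char* a, const wide_char* b) {
    while (*a != 0 && *a == *b) {
        ++a;
        ++b;
    }
    return static_cast<int>(*a) - static_cast<int>(*b);
}
```

With these, wide_strlen(u"hello") counts five code units, and the existing code only needs its wchar declarations and L"..." literals swapped for the 16-bit type and u"..." literals.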

Ok, besides flagging the duplicate, I found this interesting article:

TL;DR We elected to extract the core wide/narrow conversion routines from the Android implementation of JNI in the Android open source project so the conversion runs entirely in native code

Wide and Narrow Character String Conversions

This is quite a complex issue when porting an application because of the multiple methods and standards in existence. Windows Mobile (Windows CE) standardized on UTF-16, with two bytes per character unit, and with rare exceptions the ANSI (one byte per character unit) native APIs were eliminated. The C# language and the .NET Compact Framework also use UTF-16.

The Linux and Android native APIs rely on null-terminated strings with a single byte per character unit. A wide C++ character on Linux is 4 bytes, as opposed to 2 bytes per character unit on a Microsoft platform. One effect is to double the size in bytes of all wide character strings, including string literals prefixed with L.
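The size difference is easy to check on any given toolchain. The snippet below is a small illustration, not part of the article; the constant names are invented here, and the values in the comments are what is typical rather than guaranteed (4-byte wchar_t with GCC or Clang on Linux and Android, 2-byte with MSVC on Windows).

```cpp
#include <cstddef>

// Size in bytes of one wide character for the platform being compiled for
// (typically 4 on Linux and Android, 2 on Windows).
const std::size_t kWideCharBytes = sizeof(wchar_t);

// L"abc" holds three wide characters plus a terminating zero, so its size
// is four times the size of wchar_t.
const std::size_t kWideLiteralBytes = sizeof(L"abc");

// The same text as a C++11 UTF-16 literal: four char16_t code units,
// which is 8 bytes on mainstream platforms regardless of wchar_t.
const std::size_t kUtf16LiteralBytes = sizeof(u"abc");
```

Printing these three constants on the build target shows at a glance whether wide literals there match the UTF-16 layout the ported code expects.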

One possibility is to translate UTF-16, including surrogate pairs, to UTF-8 multi-byte strings, which can require one to four bytes for each character and can contain embedded zero bytes (for the character U+0000). The Java Native Interface (JNI) provides routines to translate Java UTF-16 into "Modified" UTF-8. The modifications result in a narrow character string containing no embedded zeros, only the zero at the end of the string. Another modification is to translate a four-byte UTF-16 surrogate pair into two UTF-8 sequences, each three bytes long, instead of a single four-byte UTF-8 sequence.
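To make the two modifications concrete, here is a minimal sketch of a UTF-16 to Modified UTF-8 encoder. It is written from the description above and is not the JNI or Android source; the function name utf16_to_modified_utf8 is hypothetical, and the code assumes C++11 char16_t.

```cpp
#include <cstddef>
#include <string>

// Encodes n UTF-16 code units as Modified UTF-8:
//  - U+0000 becomes the two bytes C0 80, so the output never contains a zero byte;
//  - each half of a surrogate pair is encoded on its own as a three-byte
//    sequence, instead of the pair becoming one four-byte sequence.
std::string utf16_to_modified_utf8(const char16_t* s, std::size_t n) {
    std::string out;
    for (std::size_t i = 0; i < n; ++i) {
        unsigned c = s[i];
        if (c != 0 && c < 0x80) {
            // Plain ASCII (except U+0000): one byte.
            out += static_cast<char>(c);
        } else if (c < 0x800) {
            // Two bytes; this branch also handles U+0000 -> C0 80.
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else {
            // Three bytes; surrogate halves land here unchanged.
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

For instance, the surrogate pair D83D DE00 comes out as the six bytes ED A0 BD ED B8 80, where standard UTF-8 would produce a single four-byte sequence.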

The end result of using JNI routines to translate between wide and narrow strings is that the wide UTF-16 string format is compatible with both Java and Windows Mobile (CE), and the narrow Modified UTF-8 string is compatible with the Android / Linux OS API and the C run time library.

The Android C run time library (Bionic) contains a wchar.h that declares functions such as wcslen, wcscpy, etc., but as noted in the comments in the header, no actual wide char functions are implemented in the Android C run time library. We resolve this by using the GNU C++ compiler option "-fshort-wchar", which forces the compiler to treat wide characters as two bytes instead of four bytes. This makes the L"string" literal two bytes per character and compatible with UTF-16. We have extracted the actual wide character run time library from the Wine open source project.

It is possible to use JNI as delivered in Android to translate between native C++ wide and narrow strings. This involves a round trip through the Java environment and thus is not very efficient. We elected to extract the core wide/narrow conversion routines from the Android implementation of JNI in the Android open source project so the conversion runs entirely in native code.
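For the narrow-to-wide direction, a purely native conversion might look like the sketch below. This is an independent illustration, not the routine the article's authors extracted from the Android source; the name modified_utf8_to_utf16 is hypothetical, and the code assumes well-formed Modified UTF-8 input and performs no validation.

```cpp
#include <string>

// Decodes a zero-terminated Modified UTF-8 string into UTF-16 code units.
// Assumes well-formed input: only one-, two-, and three-byte sequences
// occur in Modified UTF-8, and no sequence is cut short.
std::u16string modified_utf8_to_utf16(const char* s) {
    std::u16string out;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s);
    while (*p != 0) {
        unsigned c = *p++;
        if (c >= 0x80) {
            if ((c & 0xE0) == 0xC0) {
                // Two-byte sequence (C0 80 decodes to U+0000).
                unsigned b1 = *p++ & 0x3F;
                c = ((c & 0x1F) << 6) | b1;
            } else {
                // Three-byte sequence, including surrogate halves.
                unsigned b1 = *p++ & 0x3F;
                unsigned b2 = *p++ & 0x3F;
                c = ((c & 0x0F) << 12) | (b1 << 6) | b2;
            }
        }
        out += static_cast<char16_t>(c);
    }
    return out;
}
```

Because surrogate halves are stored as separate three-byte sequences, decoding them one at a time reproduces the original UTF-16 pair without any special handling. Production code would also need to reject truncated or malformed sequences.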