
Convert Unicode to char

How do I convert a Unicode string to char* or char* const in Embarcadero C++?

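String text = "Hello world";
AnsiString ansi(text);      // keep the converted copy alive in a named variable
char *txt = ansi.c_str();   // valid only while ansi exists and is not modified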

What used to be text.t_str() is now AnsiString(text).c_str(); a pointer taken from a temporary like this is only valid until the end of the full expression.

"Unicode string" isn't specific enough to tell what your source data is, but you probably mean a UTF-16 string stored as a wchar_t array, since that is what people unfamiliar with the terminology usually mean.

"char*" also isn't enough to tell which encoding you want to target, although Embarcadero may have some convention. I'll assume you want UTF-8 data unless you say otherwise.

I'll also limit my example to what works in VS2010.

// your "Unicode" string
wchar_t const * utf16_string = L"Hello, World!";

// requires #include <codecvt>, <locale> and <string>
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>,wchar_t> convert;

std::string utf8_string = convert.to_bytes(utf16_string);

This assumes that wchar_t strings are UTF-16, as is the case on Windows; otherwise the code is portable. Note that std::wstring_convert and <codecvt> have been deprecated since C++17.
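To drop the assumption about the width of wchar_t, the same conversion can be written against char16_t. This is only a minimal sketch, assuming a C++11 or later toolchain that ships <codecvt>; the helper name to_utf8 is made up for illustration and is not part of any library.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: converts UTF-16 code units to a UTF-8 byte string.
// std::wstring_convert throws std::range_error if the conversion fails.
std::string to_utf8(const std::u16string& utf16)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> convert;
    return convert.to_bytes(utf16);
}
```

The result owns its bytes, so to_utf8(text).c_str() gives a char const* that stays valid for as long as the returned std::string is kept in a variable.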

You can legally reinterpret any array as an array of char. So if your Unicode data comes in 4-byte code units like

char32_t data[100];

then you can access it as a char array:

char const * p = reinterpret_cast<char const*>(data);

for (std::size_t i = 0; i != sizeof data; ++i)
{
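    // cast to unsigned char so bytes >= 0x80 are not sign-extended when char is signed
    std::printf("Byte %03zu is 0x%02X.\n", i, static_cast<unsigned char>(p[i]));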
}

That way, you can examine the individual bytes of your Unicode data one by one.

(Of course, that has nothing to do with converting the encoding of your text. For that, use a library like iconv or ICU.)
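The byte-by-byte inspection described above could be wrapped in a small function, roughly as follows. This is a sketch only; hex_bytes is a hypothetical helper name, and the order of the bytes it reports for multi-byte code units depends on the endianness of the machine.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats the object representation of any buffer as
// space-separated two-digit hex bytes, in memory order.
std::string hex_bytes(const void* data, std::size_t size)
{
    // unsigned char avoids sign extension for bytes >= 0x80
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::string out;
    for (std::size_t i = 0; i != size; ++i)
    {
        char buf[4];
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(p[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

For a char32_t data[100] array it would be called as hex_bytes(data, sizeof data), after the array has been filled with actual text.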

If you work with Windows:

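// requires #include <windows.h>, <iostream> and <string>
std::u16string utext = u"объява";
char text[0x100];
// wchar_t is 16 bits wide on Windows, so the UTF-16 code units can be passed as wchar_t.
// The output size must be the real buffer size; the flags argument is 0, not NULL.
WideCharToMultiByte(CP_UTF8, 0, reinterpret_cast<const wchar_t*>(utext.c_str()), -1, text, sizeof text, NULL, NULL);
std::cout << text;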

We can't use std::wstring_convert because it is not available in MinGW 4.9.2.
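Where neither std::wstring_convert nor the Windows API is an option, the UTF-16 to UTF-8 step can also be written by hand. The following is a minimal sketch rather than a replacement for a library such as ICU; utf16_to_utf8 is a hypothetical name, and throwing on unpaired surrogates is just one possible policy.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: encodes UTF-16 code units as UTF-8 without <codecvt>.
// Throws std::invalid_argument on an unpaired surrogate.
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF)   // high surrogate: needs a low one next
        {
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)
        {
            throw std::invalid_argument("unpaired low surrogate");
        }

        if (cp < 0x80) {                    // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {            // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {          // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                            // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Because the function returns a std::string, there is no fixed-size buffer to overflow, and the char const* from c_str() stays valid as long as that string is kept alive.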
