
Is there any built-in function that converts wstring or wchar_t* to UTF-8 in Linux?

I want to convert a wstring to UTF-8 encoding, but I want to use built-in functions of Linux.

Is there any built-in function that converts wstring or wchar_t* to UTF-8 in Linux with a simple invocation?

Example:

wstring str = L"file_name.txt";
wstring mode = L"a";
fopen([FUNCTION](str), [FUNCTION](mode)); // Simple invoke.
cout << [FUNCTION](str); // Simple invoke.

If/when your compiler supports enough of C++11, you could use wstring_convert:

#include <iostream>
#include <codecvt>
#include <locale>
int main()
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>> utf8_conv;
    std::wstring str = L"file_name.txt";
    std::cout << utf8_conv.to_bytes(str) << '\n';
}

Tested with clang++ 2.9/libc++ on Linux and Visual Studio 2010 on Windows.
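To get closer to the "simple invoke" asked for in the question, the same conversion could be wrapped in a small helper. This is only a sketch: the names to_utf8 and fopen_wide are hypothetical and not part of any library, and std::wstring_convert and <codecvt> were deprecated in C++17, so newer compilers may warn about them.

```cpp
#include <codecvt>
#include <cstdio>
#include <locale>
#include <string>

// Hypothetical helper (not from the original answer): wide string -> UTF-8.
// Relies on std::wstring_convert, which is deprecated since C++17.
std::string to_utf8(const std::wstring& wide)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.to_bytes(wide);  // throws std::range_error on invalid input
}

// Hypothetical wrapper mirroring the fopen call from the question.
// The temporaries live until fopen returns, so c_str() is safe here.
std::FILE* fopen_wide(const std::wstring& path, const std::wstring& mode)
{
    return std::fopen(to_utf8(path).c_str(), to_utf8(mode).c_str());
}
```

With these in place, the question's calls would read fopen_wide(str, mode) and std::cout << to_utf8(str).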

The C++ language standard has no notion of explicit encodings. It only contains an opaque notion of a "system encoding", for which wchar_t is a "sufficiently large" type.

To convert from the opaque system encoding to an explicit external encoding, you must use an external library. The library of choice would be iconv() (from WCHAR_T to UTF-8), which is part of POSIX and available on many platforms, although on Windows the WideCharToMultiByte function, called with CP_UTF8, is guaranteed to produce UTF-8.
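The iconv() route could look roughly like the sketch below. The helper name wide_to_utf8_iconv is hypothetical, and the code assumes the iconv implementation accepts the encoding name "WCHAR_T" (glibc and GNU libiconv do) and uses the POSIX signature with a non-const char** input pointer.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts a wide string to UTF-8 with POSIX iconv().
// Assumes "WCHAR_T" is a known encoding name for this iconv implementation.
std::string wide_to_utf8_iconv(const std::wstring& wide)
{
    iconv_t cd = iconv_open("UTF-8", "WCHAR_T");  // arguments are (to, from)
    if (cd == (iconv_t)-1)
        throw std::runtime_error("iconv_open failed");

    // iconv works on raw bytes, so view the wchar_t buffer as char.
    char* in = const_cast<char*>(reinterpret_cast<const char*>(wide.data()));
    std::size_t in_left = wide.size() * sizeof(wchar_t);

    std::string out;
    char buf[256];
    while (in_left > 0) {
        char* out_ptr = buf;
        std::size_t out_left = sizeof buf;
        std::size_t rc = iconv(cd, &in, &in_left, &out_ptr, &out_left);
        int err = errno;  // save before any other call can change it
        out.append(buf, sizeof buf - out_left);
        // E2BIG only means the output buffer filled up; loop and continue.
        if (rc == (std::size_t)-1 && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv failed: invalid or incomplete input");
        }
    }
    iconv_close(cd);
    return out;
}
```

The fixed-size buffer with an E2BIG retry loop avoids having to guess the output size up front; UTF-8 is stateless, so no final flush call is needed here.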

C++11 adds new UTF-8 literals in the form of std::string s = u8"Hello World: \U0010FFFF";. Those are already in UTF-8, but they cannot interface with the opaque wstring other than through the way I described.

See this question for a bit more background.

It's quite plausible that wcstombs will do what you need if what you actually want to do is convert from wide characters to the current locale.

If not, then you probably need to look to ICU, Boost, or similar.
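For the wcstombs case, a minimal sketch might look like this. The helper name wide_to_locale is hypothetical. The output is in whatever multibyte encoding the current LC_CTYPE locale uses, so it is UTF-8 only if the program has selected a UTF-8 locale (for example via std::setlocale(LC_ALL, "") under a UTF-8 environment). Passing a null destination to query the required length is a POSIX extension, and an embedded L'\0' in the input would end the conversion early.

```cpp
#include <clocale>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts a wide string to the multibyte encoding of
// the current LC_CTYPE locale (UTF-8 only when that locale is a UTF-8 one).
std::string wide_to_locale(const std::wstring& wide)
{
    // POSIX extension: with a null destination, wcstombs returns the number
    // of bytes required, not counting the terminating null byte.
    std::size_t needed = std::wcstombs(nullptr, wide.c_str(), 0);
    if (needed == static_cast<std::size_t>(-1))
        throw std::runtime_error("character not representable in current locale");

    std::string out(needed, '\0');
    std::wcstombs(&out[0], wide.c_str(), needed);  // writes exactly 'needed' bytes
    return out;
}
```

In the default "C" locale this handles plain ASCII names such as L"file_name.txt"; anything beyond that depends on the locale the program has set.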

Certainly there is no function built into Linux, because the name Linux refers to the kernel only, which doesn't have anything to do with it. I seriously doubt that the libc that comes with gcc has such a function, and

$ man -k utf

supports this theory. But there are plenty of good UTF-8 libraries around. I personally recommend the iconv library for such conversions.
