C++: making a Unicode char from a string
I have a string like this:
string s = "0081";
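and I need to make a string like this:
string c = "\u0081";
How can I produce that string of length 1 from the original string of length 4?
Edit: My mistake, "\u0081" is not a char (1 byte) but a 2-byte character/string? My input is actually the binary number 1000 0001, i.e. 0x81, and that is where my string "0081" comes from. Would it be easier to go from 0x81 to string c = "\u0081"? Thanks for all the help.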
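Here you go:
// Requires <sstream>, <string> and <cassert>.
unsigned int x;
std::stringstream ss;
ss << std::hex << "1081";   // std::hex sets the base used by the extraction below
ss >> x;                    // parse the hex digits into a number
wchar_t wc1 = x;
wchar_t wc2 = L'\u1081';
assert(wc1 == wc2);
std::wstring ws(1, wc1);    // wide string holding that single character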
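The steps above can be wrapped into one function. This is only a sketch: the name HexToWide and the choice to throw std::invalid_argument are assumptions, not part of the answer. It also assumes the value has to fit in a single wchar_t, which is 16 bits wide on some platforms (Windows, for example), so code points above 0xFFFF are rejected there.

```cpp
#include <cassert>
#include <limits>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse hex digits such as "1081" into a one-character wide string.
std::wstring HexToWide(const std::string& hex)
{
    std::istringstream ss(hex);
    unsigned long value = 0;
    ss >> std::hex >> value;

    // Reject input that is not entirely hexadecimal digits.
    if (ss.fail() || !ss.eof())
        throw std::invalid_argument("not a hexadecimal number");

    // Reject values that do not fit in one wchar_t on this platform.
    if (value > static_cast<unsigned long>(std::numeric_limits<wchar_t>::max()))
        throw std::invalid_argument("value does not fit in wchar_t");

    return std::wstring(1, static_cast<wchar_t>(value));
}

// Usage: the result matches the wide character literal from the answer.
void Example()
{
    assert(HexToWide("1081") == std::wstring(1, L'\u1081'));
}
```

Throwing keeps the call site short; returning an empty string on bad input would work as well if exceptions are not wanted.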
Here is the whole process, based on some code I linked in another comment.
string s = "0081";
long codepoint = strtol(s.c_str(), NULL, 16);
string c = CodepointToUTF8(codepoint);
// Encodes one code point as UTF-8 (1 to 4 bytes). Requires <string>.
std::string CodepointToUTF8(long codepoint)
{
    std::string out;
    if (codepoint <= 0x7f)
        // 1 byte: 0xxxxxxx
        out.append(1, static_cast<char>(codepoint));
    else if (codepoint <= 0x7ff)
    {
        // 2 bytes: 110xxxxx 10xxxxxx
        out.append(1, static_cast<char>(0xc0 | ((codepoint >> 6) & 0x1f)));
        out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
    }
    else if (codepoint <= 0xffff)
    {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.append(1, static_cast<char>(0xe0 | ((codepoint >> 12) & 0x0f)));
        out.append(1, static_cast<char>(0x80 | ((codepoint >> 6) & 0x3f)));
        out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
    }
    else
    {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.append(1, static_cast<char>(0xf0 | ((codepoint >> 18) & 0x07)));
        out.append(1, static_cast<char>(0x80 | ((codepoint >> 12) & 0x3f)));
        out.append(1, static_cast<char>(0x80 | ((codepoint >> 6) & 0x3f)));
        out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
    }
    return out;
}
Note that this code does no error checking, so if you pass it an invalid code point you will get back an invalid string.
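If error checking is wanted, one possible approach is to validate the code point before encoding it. The sketch below is not from the answer: IsValidCodepoint, CheckedCodepointToUTF8 and HexToUTF8 are hypothetical names, and throwing std::invalid_argument is an assumed error policy. It treats negative values, values above 0x10FFFF and the surrogate range 0xD800 to 0xDFFF as invalid, and uses the same bit layout as the encoder above.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>

// True if cp is a Unicode scalar value: within [0, 0x10FFFF] and not a surrogate.
bool IsValidCodepoint(long cp)
{
    if (cp < 0 || cp > 0x10ffff) return false;
    if (cp >= 0xd800 && cp <= 0xdfff) return false;
    return true;
}

// Hypothetical checked variant of the encoder: throws instead of returning garbage.
std::string CheckedCodepointToUTF8(long cp)
{
    if (!IsValidCodepoint(cp))
        throw std::invalid_argument("not a Unicode scalar value");

    std::string out;
    if (cp <= 0x7f) {
        out += static_cast<char>(cp);
    } else if (cp <= 0x7ff) {
        out += static_cast<char>(0xc0 | ((cp >> 6) & 0x1f));
        out += static_cast<char>(0x80 | (cp & 0x3f));
    } else if (cp <= 0xffff) {
        out += static_cast<char>(0xe0 | ((cp >> 12) & 0x0f));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3f));
        out += static_cast<char>(0x80 | (cp & 0x3f));
    } else {
        out += static_cast<char>(0xf0 | ((cp >> 18) & 0x07));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3f));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3f));
        out += static_cast<char>(0x80 | (cp & 0x3f));
    }
    return out;
}

// Hypothetical helper: parse a hex string such as "0081", then encode it.
std::string HexToUTF8(const std::string& hex)
{
    char* end = nullptr;
    long cp = std::strtol(hex.c_str(), &end, 16);
    // strtol stops at the first character it cannot parse; require that to be the end.
    if (hex.empty() || *end != '\0')
        throw std::invalid_argument("not a hexadecimal number");
    return CheckedCodepointToUTF8(cp);
}

// Usage: U+0081 becomes the two bytes 0xC2 0x81, which answers the "2 bytes" question.
void SelfCheck()
{
    assert(HexToUTF8("0081") == "\xc2\x81");
}
```

Note that strtol still accepts leading whitespace, a sign and an optional 0x prefix, so stricter input rules would need an extra check on the characters of the string.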