How can I get one character from a Cyrillic string in C++?

I have a wstring containing a Cyrillic word, and I need to get one letter from it. This is the only way I have found:

wstring line;
wifstream myfile (".../outfile.txt");
if (myfile.is_open())
{
    // Loop on getline itself so a failed read at end of file is not processed.
    while (getline (myfile,line))
    {
        wstring a = line.substr(0,2); // this gives one first letter
        //....
    }
    myfile.close();
}

Are there better ways to get a letter from a Cyrillic string?

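If the character is stored as a surrogate pair in UTF-16 (the common Cyrillic letters, U+0400 to U+04FF, sit in the Basic Multilingual Plane and take a single code unit, so this applies only to supplementary characters), then instead of doing this: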

wstring a = line.substr(0,2);

you might want to consider doing something similar to this:

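// Assumes line is not empty; line[1] on an empty string is undefined behavior.
const wchar_t surrogate[] = { line[0], line[1], L'\0' };   // first two code units
const wchar_t non_surrogate[] = { line[0], L'\0' };        // first code unit only
const wstring a = IS_SURROGATE_PAIR(surrogate[0], surrogate[1]) ?
                  surrogate :
                  non_surrogate;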

The IS_SURROGATE_PAIR macro is Windows-specific. On other platforms you can do the check yourself by reading the macro's documentation and the accompanying Surrogates and Supplementary Characters docs.
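For anyone not on Windows, doing the check yourself could look roughly like the sketch below. The helper names (is_high_surrogate, is_low_surrogate, first_character) are made up for illustration and are not part of any library. It assumes each wstring element is a UTF-16 code unit, which holds where wchar_t is 16 bits wide; where wchar_t is 32 bits, one element is normally a whole code point already and the surrogate branch is simply never taken.

```cpp
#include <string>

// Hypothetical helpers, not from the original answer or any library.
// UTF-16 reserves 0xD800-0xDBFF for high (leading) surrogates and
// 0xDC00-0xDFFF for low (trailing) surrogates.
bool is_high_surrogate(wchar_t c) { return c >= 0xD800 && c <= 0xDBFF; }
bool is_low_surrogate(wchar_t c)  { return c >= 0xDC00 && c <= 0xDFFF; }

// Returns the first character of `line`: two code units when the string
// starts with a surrogate pair, otherwise one. An empty input gives an
// empty result, so there is no out-of-range access.
std::wstring first_character(const std::wstring& line)
{
    if (line.size() >= 2 &&
        is_high_surrogate(line[0]) && is_low_surrogate(line[1]))
    {
        return line.substr(0, 2);
    }
    return line.substr(0, 1);
}
```

With this, the line in the loop would become something like wstring a = first_character(line);. Note that this only handles surrogate pairs; it does not group combining marks with their base letter.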
