How to get one character from a Cyrillic string in C++

Question

I have a wstring containing a Cyrillic word, and I need to get one letter from it. The only way I have found is this:

wstring line;
wifstream myfile (".../outfile.txt");
if (myfile.is_open())
{
    // Test the stream after each read so a failed getline is never processed.
    while (getline(myfile, line))
    {
        wstring a = line.substr(0,2); // this gives one first letter
        //....
    }
    myfile.close();
}

Are there better ways to get a letter from a Cyrillic string?

Answer 1

If Cyrillic uses surrogate pairs in UTF-16 encoding, then instead of doing this:

wstring a = line.substr(0,2);

you might want to consider doing something similar to this:

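// IS_SURROGATE_PAIR is a Windows macro (winnls.h, included via windows.h).
// Assumes line is non-empty; line[1] is L'\0' when line holds one code unit.
const wchar_t surrogate[] = { line[0], line[1], L'\0' };
const wchar_t non_surrogate[] = { line[0], L'\0' };
const wstring a = IS_SURROGATE_PAIR(surrogate[0], surrogate[1]) ?
                  surrogate :
                  non_surrogate;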

The IS_SURROGATE_PAIR macro is for Windows. If you are on another platform, you can do the check yourself by reading the macro's documentation and the accompanying "Surrogates and Supplementary Characters" docs.
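
For anyone not on Windows, here is a minimal sketch of doing the check by hand. It assumes the text is held as UTF-16 in a std::u16string rather than a wstring, because wchar_t is not 16 bits wide on every platform. The names is_surrogate_pair and first_character are hypothetical helpers made up for this illustration, not part of any library.

```cpp
#include <cassert>
#include <string>

// Hypothetical portable stand-in for the Windows IS_SURROGATE_PAIR macro.
// In UTF-16 a high (lead) surrogate lies in 0xD800..0xDBFF and a
// low (trail) surrogate lies in 0xDC00..0xDFFF.
inline bool is_surrogate_pair(char16_t hi, char16_t lo) {
    return hi >= 0xD800 && hi <= 0xDBFF && lo >= 0xDC00 && lo <= 0xDFFF;
}

// Hypothetical helper: returns the first code point of a UTF-16 string
// as a substring of one or two code units (empty if the input is empty).
inline std::u16string first_character(const std::u16string& s) {
    if (s.size() >= 2 && is_surrogate_pair(s[0], s[1])) {
        return s.substr(0, 2);  // supplementary character: two code units
    }
    return s.substr(0, 1);      // BMP character: one code unit
}
```

Note that Cyrillic letters in the U+0400 to U+04FF block lie in the Basic Multilingual Plane, so in UTF-16 each of them is a single code unit and first_character would return one unit for them. The two-unit branch only matters for supplementary characters such as emoji.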

Answer 1 (0, 2015-03-04 09:52:59)