
Can I create a variable name with non-ASCII characters in C++?

I have been writing C++ since 2010.

I just accidentally typed the letter "й" in my code, hovered the mouse over it to remove it, and noticed that Visual Studio simply said there is no variable "й".

I wrote int й = 1; and it just compiled!

What did I miss?


I bet it's probably a feature of C++11, C++14, or something like that.

Here's what the Standard says ([lex.phases]):

Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. The set of physical source file characters accepted is implementation-defined.

So your particular implementation supports that, but it's not guaranteed to be portable to any other implementation.
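As a minimal sketch of a spelling that does not depend on how the compiler reads the physical file, the same identifier can be written as a universal-character-name, so the source stays in plain ASCII. The function name short_i_plus_one is hypothetical and exists only for this illustration; the sketch assumes a C++14 or later compiler that accepts universal-character-names in identifiers.

```cpp
// Sketch only: short_i_plus_one is a made-up name for illustration.
// \u0439 is the universal-character-name for the Cyrillic letter
// "short i" (U+0439), the character from the question.
constexpr int short_i_plus_one() {
    int \u0439 = 1;       // declares a variable whose name is U+0439
    return \u0439 + 1;    // refers to the same variable again
}
```

Because the identifier is spelled with an escape, the file contains no non-ASCII bytes, so the implementation-defined mapping of physical source characters does not come into play for it.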

If you look at Annex E of this paper, you can see that certain Unicode ranges are allowed in variable names. These ranges include:

00A8, 00AA, 00AD, 00AF, 00B2-00B5, 00B7-00BA, 00BC-00BE, 00C0-00D6, 00D8-00F6, 00F8-00FF
0100-167F, 1681-180D, 180F-1FFF
200B-200D, 202A-202E, 203F-2040, 2054, 2060-206F
2070-218F, 2460-24FF, 2776-2793, 2C00-2DFF, 2E80-2FFF
3004-3007, 3021-302F, 3031-303F
3040-D7FF
F900-FD3D, FD40-FDCF, FDF0-FE44, FE47-FFFD
10000-1FFFD, 20000-2FFFD, 30000-3FFFD, 40000-4FFFD, 50000-5FFFD, 60000-6FFFD, 70000-7FFFD, 80000-8FFFD, 90000-9FFFD, A0000-AFFFD, B0000-BFFFD, C0000-CFFFD, D0000-DFFFD, E0000-EFFFD
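To make the list easier to check against a particular character, here is a small sketch that encodes the ranges above in a table and tests a code point against it. The names CodePointRange, kAnnexERanges and is_in_annex_e_range are hypothetical, and the table covers only the extended ranges quoted here (ASCII letters, digits and underscore are handled separately by the language). It assumes C++14 or later.

```cpp
// Sketch only: all names below are made up for this illustration.
struct CodePointRange {
    char32_t first;  // inclusive lower bound
    char32_t last;   // inclusive upper bound
};

// The ranges quoted above, transcribed one entry per range.
constexpr CodePointRange kAnnexERanges[] = {
    {0x00A8, 0x00A8}, {0x00AA, 0x00AA}, {0x00AD, 0x00AD}, {0x00AF, 0x00AF},
    {0x00B2, 0x00B5}, {0x00B7, 0x00BA}, {0x00BC, 0x00BE}, {0x00C0, 0x00D6},
    {0x00D8, 0x00F6}, {0x00F8, 0x00FF},
    {0x0100, 0x167F}, {0x1681, 0x180D}, {0x180F, 0x1FFF},
    {0x200B, 0x200D}, {0x202A, 0x202E}, {0x203F, 0x2040}, {0x2054, 0x2054},
    {0x2060, 0x206F},
    {0x2070, 0x218F}, {0x2460, 0x24FF}, {0x2776, 0x2793}, {0x2C00, 0x2DFF},
    {0x2E80, 0x2FFF},
    {0x3004, 0x3007}, {0x3021, 0x302F}, {0x3031, 0x303F},
    {0x3040, 0xD7FF},
    {0xF900, 0xFD3D}, {0xFD40, 0xFDCF}, {0xFDF0, 0xFE44}, {0xFE47, 0xFFFD},
    {0x10000, 0x1FFFD}, {0x20000, 0x2FFFD}, {0x30000, 0x3FFFD},
    {0x40000, 0x4FFFD}, {0x50000, 0x5FFFD}, {0x60000, 0x6FFFD},
    {0x70000, 0x7FFFD}, {0x80000, 0x8FFFD}, {0x90000, 0x9FFFD},
    {0xA0000, 0xAFFFD}, {0xB0000, 0xBFFFD}, {0xC0000, 0xCFFFD},
    {0xD0000, 0xDFFFD}, {0xE0000, 0xEFFFD},
};

// Returns true if cp falls inside any of the ranges listed above.
constexpr bool is_in_annex_e_range(char32_t cp) {
    for (const CodePointRange& r : kAnnexERanges) {
        if (cp >= r.first && cp <= r.last) {
            return true;
        }
    }
    return false;
}
```

For example, is_in_annex_e_range(0x0439) is true for the letter "й" from the question, because U+0439 lies inside 0100-167F, while is_in_annex_e_range(0x00D7) is false for the multiplication sign, which sits in the gap between 00C0-00D6 and 00D8-00F6.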

Well, it seems there is no restriction on which Unicode characters can be used to define an identifier, according to MSDN:

struct テスト                  // Japanese 'test'
{
    void トスト() {}           // Japanese 'toast'
};

int main() {
    テスト \u30D1\u30F3;       // Japanese パン 'bread' in UCN form
    パン.トスト();             // compiler recognizes UCN or literal form
}

I am disappointed that cplusplus.com has no word about this.
