What's an "ignorable character in a Java identifier"?

I stumbled across this doc and wondered what it was all about. Apparently you can have certain control characters inside identifiers, and they are ignored:

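public class Main {
    public static void main(String[] args) throws Exception {
        int dummy = 123;
        // U+200B (zero-width space) sits after the `d` and before the `u`,
        // written as a Unicode escape so it survives copy and paste
        System.out.println(d\u200Bummy);
    }
}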

I couldn't find anything about this in the JLS. IntelliJ IDEA shows an error in the editor saying "dummy" is an undeclared identifier, yet the code compiles and runs. I guess that's a bug in IntelliJ? What purpose do these "ignorable characters" serve?

(Note: StackOverflow seems to remove my control characters from the question.)

There is an open issue for this contradiction.

In summary, the compiler does ignore these characters when matching identifier names, but the JLS doesn't mention this. Instead, the JLS says:

Two identifiers are the same only if they are identical, that is, have the same Unicode character for each letter or digit.

Also

A "Java letter-or-digit" is a character for which the method Character.isJavaIdentifierPart(int) returns true.

The contradiction is obvious:

Character.isJavaIdentifierPart('\u0001')  -> true, so used to compare identifier names
Character.isIdentifierIgnorable('\u0001') -> true, should be ignored actually
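To see both predicates side by side, here is a minimal sketch. The class name IgnorableCheck and the helper classify are made up for illustration; only Character.isJavaIdentifierPart(int) and Character.isIdentifierIgnorable(int) come from the JDK.

```java
public class IgnorableCheck {
    // Hypothetical helper: reports how the two Character predicates classify a code point.
    static String classify(int codePoint) {
        return String.format("U+%04X part=%b ignorable=%b",
                codePoint,
                Character.isJavaIdentifierPart(codePoint),
                Character.isIdentifierIgnorable(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(classify(0x0001)); // control character from the lines above
        System.out.println(classify(0x200B)); // zero-width space from the question
        System.out.println(classify('a'));    // ordinary letter, for comparison
    }
}
```

On a current JDK I'd expect the first two code points to come back as both an identifier part and ignorable, while the plain letter is only an identifier part.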

I speculate that IntelliJ IDEA either follows the JLS or is simply unaware of ignorable characters. I don't see a bug report for this here.
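For what it's worth, a tool that wanted to match the compiler's behaviour rather than the letter of the JLS could drop ignorable characters before comparing names. This is only a sketch of that idea, not how javac or IntelliJ actually implement it; IdentifierNormalizer, stripIgnorable and sameIdentifier are hypothetical names.

```java
public class IdentifierNormalizer {
    // Hypothetical helper: removes every code point that
    // Character.isIdentifierIgnorable(int) reports as ignorable.
    static String stripIgnorable(String identifier) {
        StringBuilder sb = new StringBuilder(identifier.length());
        identifier.codePoints()
                .filter(cp -> !Character.isIdentifierIgnorable(cp))
                .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    // Two names count as the same identifier if they match once ignorables are dropped.
    static boolean sameIdentifier(String a, String b) {
        return stripIgnorable(a).equals(stripIgnorable(b));
    }

    public static void main(String[] args) {
        String withZeroWidthSpace = "d\u200Bummy"; // U+200B between `d` and `u`
        System.out.println(withZeroWidthSpace.equals("dummy"));          // strict, JLS-style comparison
        System.out.println(sameIdentifier(withZeroWidthSpace, "dummy")); // comparison after stripping
    }
}
```

The strict comparison treats the two names as different, which is roughly what the editor error looks like, while the stripped comparison treats them as equal, which is what the compiler appears to do.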

As for the purpose of these ignorable characters: Unicode specifies some Layout and Format Control Characters. It is suggested that these characters be ignored in identifier names because

the effects they represent are stylistic or otherwise out of scope for identifiers, and second because the characters themselves often have no visible display

Apparently the purpose of isIdentifierIgnorable is to identify characters of this category. For instance, the isIdentifierIgnorable documentation mentions that it returns true for characters with the FORMAT general category value, that is, characters whose Unicode General_Category is Cf, which are included in the Layout and Format Control Characters.
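The link between the FORMAT category and isIdentifierIgnorable can be checked with Character.getType(int). A small sketch follows; the class FormatCategoryCheck and the helper isFormatCharacter are illustrative names, not JDK API.

```java
public class FormatCategoryCheck {
    // Hypothetical helper: true when the code point's general category is Cf (FORMAT).
    static boolean isFormatCharacter(int codePoint) {
        return Character.getType(codePoint) == Character.FORMAT;
    }

    public static void main(String[] args) {
        int[] samples = {
            0x200B, // zero-width space
            0x00AD, // soft hyphen
            0x0001  // ISO control character, category Cc rather than Cf
        };
        for (int cp : samples) {
            System.out.printf("U+%04X format=%b ignorable=%b%n",
                    cp, isFormatCharacter(cp), Character.isIdentifierIgnorable(cp));
        }
    }
}
```

Note that FORMAT is not the only criterion: the documentation also lists some ISO control characters that are not whitespace (such as U+0001), so those are ignorable even though they are not in Cf.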
