PEG grammar to parse an identifier name that is not a keyword
I am using Pest.rs for parsing. I need to parse identifiers but reject them if they happen to be a reserved keyword. For example, "bat" is a valid identifier name, but "this" is not, since it has a specific meaning. My simplified grammar is below.
keyword = {"this" | "function"}
identifier = {ASCII+}
valid_identifier = { !keyword ~ identifier }
This works, but it also rejects identifier names like "thisBat". The lookahead only checks that the input does not start with a keyword, whereas I want to check against the full identifier.
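To make the problem concrete, here is a minimal plain-Rust sketch of what the negative lookahead does. It does not use pest at all; `prefix_check` is a hypothetical helper written for illustration, and it only reports whether the rule would match at the start of the input.

```rust
/// Reserved words, copied from the `keyword` rule in the grammar.
const KEYWORDS: [&str; 2] = ["this", "function"];

/// Hypothetical stand-in for `!keyword ~ identifier`.
/// The lookahead fails as soon as the input *starts with* a keyword,
/// regardless of what comes after it.
fn prefix_check(input: &str) -> bool {
    let starts_with_keyword = KEYWORDS.iter().any(|kw| input.starts_with(*kw));
    // `identifier = { ASCII+ }` needs at least one ASCII character.
    let has_identifier = input.chars().next().map_or(false, |c| c.is_ascii());
    !starts_with_keyword && has_identifier
}
```

With this model, "bat" is accepted, while both "this" and "thisBat" are rejected, which is the behaviour described above.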
I figured out a hack to address this.
keyword = {"this" | "function"}
identifier = {ASCII+}
valid_identifier = @{ !keyword ~ identifier | keyword ~ identifier }
The new second alternative in valid_identifier matches the valid case that the first one rejects: a keyword followed by more identifier characters. Note that I have made valid_identifier atomic so that implicit whitespace is not allowed between the two parts and the parse output is a single "thisBat" rather than "this" and "Bat".
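The two alternatives can be modelled in plain Rust as follows. This is only a sketch under the same assumptions as the grammar (no pest involved, `hack_check` is a hypothetical name), and it reports whether the rule would match at the start of the input. It also shows a caveat: pest's built-in ASCII rule covers every ASCII character, including a space, so a keyword followed by a space still passes.

```rust
/// Reserved words, copied from the `keyword` rule in the grammar.
const KEYWORDS: [&str; 2] = ["this", "function"];

/// Hypothetical stand-in for
/// `!keyword ~ identifier | keyword ~ identifier` with `identifier = { ASCII+ }`.
fn hack_check(input: &str) -> bool {
    // `ASCII+` succeeds if at least one ASCII character is available.
    let first_is_ascii = |s: &str| s.chars().next().map_or(false, |c| c.is_ascii());
    match KEYWORDS.iter().find(|kw| input.starts_with(**kw)) {
        // First alternative: no keyword prefix, so just match the identifier.
        None => first_is_ascii(input),
        // Second alternative: a keyword followed by at least one more character.
        Some(kw) => first_is_ascii(&input[kw.len()..]),
    }
}
```

Under this model "bat" and "thisBat" pass and a bare "this" fails, but "this bat" also passes because the space counts as an ASCII character.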
Supposing that identifiers are composed of alphanumeric characters, another option is:假设标识符由字母数字字符组成,另一种选择是:
keyword = {"this" | "function"}
identifier = @{ !(keyword ~ !ASCII_ALPHANUMERIC) ~ ASCII_ALPHANUMERIC+ }
!(keyword ~ !ASCII_ALPHANUMERIC) rejects any input that starts with a keyword, as long as the character following the keyword can't be part of the identifier itself. This means that "thisBat" is an acceptable identifier, but "this" is not.
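The same word-boundary idea can be sketched in plain Rust, without pest, to check the reasoning. `match_identifier` is a hypothetical helper, not part of any library; it returns the identifier matched at the start of the input, if any.

```rust
/// Reserved words, copied from the `keyword` rule in the grammar.
const KEYWORDS: [&str; 2] = ["this", "function"];

/// Hypothetical stand-in for
/// `@{ !(keyword ~ !ASCII_ALPHANUMERIC) ~ ASCII_ALPHANUMERIC+ }`.
fn match_identifier(input: &str) -> Option<&str> {
    // `!(keyword ~ !ASCII_ALPHANUMERIC)`: reject when a keyword is followed by
    // the end of input or by a character that cannot continue an identifier.
    // Assumption: no keyword is a prefix of another one (true for these two),
    // so checking every keyword agrees with PEG ordered choice.
    for kw in KEYWORDS.iter() {
        if let Some(rest) = input.strip_prefix(*kw) {
            let continues = rest
                .chars()
                .next()
                .map_or(false, |c| c.is_ascii_alphanumeric());
            if !continues {
                return None;
            }
        }
    }
    // `ASCII_ALPHANUMERIC+`: take the longest non-empty alphanumeric run.
    let end = input
        .find(|c: char| !c.is_ascii_alphanumeric())
        .unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some(&input[..end])
    }
}
```

Here "thisBat" and "bat" are returned whole, while "this" and "function(" yield no match because the keyword ends at a word boundary.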