PEG grammar to parse identifier name which is not a keyword

I am using Pest.rs for parsing. I need to parse identifiers but reject them if they happen to be reserved keywords. For example, "bat" is a valid identifier name but "this" is not, since it has a specific meaning. My simplified grammar is below.

keyword = {"this" | "function"}
identifier = {ASCII+}
valid_identifier = { !keyword ~ identifier }

This works, but it also rejects identifier names like "thisBat". So basically it only checks that the prefix is not a keyword, whereas I want to check against the full identifier.
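To make the prefix problem concrete, here is a minimal sketch in plain Rust that imitates the three rules by hand. It does not use pest at all. The function names and the KEYWORDS constant are stand-ins invented for illustration, and ASCII is assumed to mean any ASCII character.

```rust
// Hand-written stand-in for the grammar above; not generated by pest.
const KEYWORDS: [&str; 2] = ["this", "function"];

/// Imitates `!keyword`: succeeds only if no keyword matches at the start of `input`.
fn not_keyword(input: &str) -> bool {
    !KEYWORDS.iter().any(|kw| input.starts_with(*kw))
}

/// Imitates `identifier = {ASCII+}` (assumption: ASCII means any ASCII character).
/// Returns the length of the longest non-empty ASCII prefix.
fn identifier(input: &str) -> Option<usize> {
    // Every ASCII character is one byte, so the count is also a byte offset.
    let len = input.chars().take_while(|c| c.is_ascii()).count();
    if len > 0 { Some(len) } else { None }
}

/// Imitates `valid_identifier = { !keyword ~ identifier }`.
fn valid_identifier(input: &str) -> Option<&str> {
    // The lookahead only inspects the start of the input, so any input that
    // merely begins with a keyword is rejected before `identifier` runs.
    if !not_keyword(input) {
        return None;
    }
    identifier(input).map(|len| &input[..len])
}
```

Under these assumptions, valid_identifier("bat") returns Some("bat"), while both valid_identifier("this") and valid_identifier("thisBat") return None, which is the over-rejection described above.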

I figured out a hack to address this.

keyword = {"this" | "function"}
identifier = {ASCII+}
valid_identifier = @{ !keyword ~ identifier | keyword ~ identifier }

The new second alternative in valid_identifier matches the valid case that the first one rejects: a keyword followed by further identifier characters. Note that I have made valid_identifier atomic so that whitespace is not inserted and the parse output is a single "thisBat" rather than "this" and "Bat".

Supposing that identifiers are composed of alphanumeric characters, another option is:

keyword = {"this" | "function"}
identifier = @{ !(keyword ~ !ASCII_ALPHANUMERIC) ~ ASCII_ALPHANUMERIC+ }

!(keyword ~ !ASCII_ALPHANUMERIC) rejects any identifier that starts with a keyword, as long as the character following the keyword can't be part of the identifier itself. This means that "thisBat" is an acceptable identifier, but "this" is not.
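The same lookahead can be sketched by hand in plain Rust, which may help when reasoning about edge cases. This is an illustration only, not pest output. The names whole_keyword_ahead and identifier and the KEYWORDS constant are hypothetical.

```rust
// Hand-written stand-in for the rule above; not generated by pest.
const KEYWORDS: [&str; 2] = ["this", "function"];

/// Imitates `keyword ~ !ASCII_ALPHANUMERIC`.
fn whole_keyword_ahead(input: &str) -> bool {
    // Ordered choice: commit to the first keyword that matches as a prefix.
    match KEYWORDS.iter().find_map(|kw| input.strip_prefix(*kw)) {
        // `!ASCII_ALPHANUMERIC`: the next character, if any, must not be
        // able to continue the identifier.
        Some(rest) => !rest
            .chars()
            .next()
            .map_or(false, |c| c.is_ascii_alphanumeric()),
        None => false,
    }
}

/// Imitates `identifier = @{ !(keyword ~ !ASCII_ALPHANUMERIC) ~ ASCII_ALPHANUMERIC+ }`.
fn identifier(input: &str) -> Option<&str> {
    // Negative lookahead: fail if a complete keyword sits at the start.
    if whole_keyword_ahead(input) {
        return None;
    }
    // `ASCII_ALPHANUMERIC+`: consume one or more alphanumeric characters.
    // Each is one byte, so the count doubles as a byte offset.
    let len = input
        .chars()
        .take_while(|c| c.is_ascii_alphanumeric())
        .count();
    if len > 0 { Some(&input[..len]) } else { None }
}
```

In this sketch identifier("thisBat") yields Some("thisBat") and identifier("this") yields None. One design point worth noting: because the choice in keyword is ordered, a keyword list in which one entry is a prefix of another (for example a hypothetical "in" before "int") would commit to the shorter entry first, so the longer entry would not be rejected unless it is listed first.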
