How do I parse this with a PEG grammar?
I'm trying to make a parser using pegjs. I need to parse something like:
blah blah START Lorem ipsum
dolor sit amet, consectetur
adipiscing elit END foo bar
etc.
I'm having trouble writing a rule that captures the text from "START" to "END".
Use negative lookahead predicates:
phrase
  = (!"START" .)* "START" result:(!"END" .)* "END" .* {
      for (var i = 0; i < result.length; ++i) {
        // remove empty element added by predicate matching
        result[i] = result[i][1];
      }
      return result.join("");
    }
You need a negative predicate for "END" as well as for "START" because repetition in pegjs is greedy: a bare .* would consume the rest of the input, "END" included, and the "END" literal that follows could then never match.
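To make the greedy-repetition point concrete, here is a rough hand-written imitation in plain JavaScript. The helper names (matchAnyStar, matchUntil, matchLiteral) are made up for illustration and are not part of pegjs; they only mimic how the two repetition forms behave, assuming PEG semantics where a repetition never gives back what it has consumed.

```javascript
// Hypothetical helpers that imitate PEG matching by hand (not pegjs APIs).

// Imitates `.*`: consumes every remaining character and never backtracks.
function matchAnyStar(input, pos) {
  return input.length;
}

// Imitates `(!"END" .)*`: consumes one character at a time, but only
// while the stop literal does not match at the current position.
function matchUntil(input, pos, stop) {
  while (pos < input.length && !input.startsWith(stop, pos)) {
    pos += 1;
  }
  return pos;
}

// Imitates a string literal: returns the position after it, or -1 on failure.
function matchLiteral(input, pos, literal) {
  return input.startsWith(literal, pos) ? pos + literal.length : -1;
}

const input = "START Lorem ipsum END foo";
const bodyStart = matchLiteral(input, 0, "START");

// Greedy version: `.*` runs to the end of the input, so "END" cannot match.
const greedyEnd = matchAnyStar(input, bodyStart);
const greedyResult = matchLiteral(input, greedyEnd, "END");

// Guarded version: the repetition stops right before "END".
const guardedEnd = matchUntil(input, bodyStart, "END");
const guardedResult = matchLiteral(input, guardedEnd, "END");
const captured = input.slice(bodyStart, guardedEnd);

console.log(greedyResult, guardedResult, JSON.stringify(captured));
```

The greedy attempt fails at the "END" literal, while the guarded one succeeds and leaves the text between the two markers in captured.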
Alternatively, the action could be written as
{return result.join("").split(',').join("");}
although this relies on not-necessarily documented behavior of join when dealing with nested arrays (namely that it joins each sub-array with commas and then concatenates the results). Note that it also strips any literal commas from the captured text, such as the one in the sample input.
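A small plain-JavaScript sketch of that join behavior, run outside pegjs. The result array below is built by hand to look like what the rule would produce for the text "a,b", assuming the predicate contributes undefined as the first element of each pair (an empty string would stringify the same way).

```javascript
// Hand-built stand-in for `result`: one [predicate value, character] pair
// per captured character. The undefined first element is an assumption.
const result = [[undefined, "a"], [undefined, ","], [undefined, "b"]];

// Each inner pair is stringified as "," + character before concatenation.
const joined = result.join("");

// Removing the separator commas also removes the comma that was in the text.
const viaSplit = joined.split(",").join("");

// Picking the second element of each pair keeps the text intact.
const viaIndex = result.map(function (pair) { return pair[1]; }).join("");

console.log(JSON.stringify(joined), JSON.stringify(viaSplit), JSON.stringify(viaIndex));
```

So the split/join shortcut is only safe when the captured text cannot contain commas; indexing into each pair does not have that limitation.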
[UPDATE] A shorter way to deal with the empty elements is:
phrase
  = (!"START" .)* "START" result:(t:(!"END" .) { return t[1]; })* "END" .* {
      return result.join("");
    }