Following up from an earlier question, I'm a bit confused about the precedence of the /.+/
regex line. I would expect the test below to print:
    line
    line x
    chunk abc
Instead, I get:
    line
    line x
    line abc
from lark import Lark

def test_tokenizing(self):
    # Attempt: give the `line` rule a lower priority than `chunk`.
    p = Lark(r"""
        _NL: /\n/
        line.-1: /.+/? _NL
        chunk: /abc/ _NL
        start: (line|chunk)+
    """, parser='lalr')
    text = '\nx\nabc\n'
    print(p.parse(text).pretty())
In Lark, priorities mean different things for rules and for terminals.
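As a quick reminder, rules have lowercase names, while terminals have UPPERCASE names.
In LALR mode, priorities on rules only affect which rule is chosen in case of a reduce/reduce conflict. They have no effect on the terminals inside the rule. In your grammar it is the lexer, not the parser, that decides whether abc is tokenized by /.+/ or by /abc/, so the priority on line never comes into play.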
What you want is to change the priority on the terminal itself:
from lark import Lark

def test_tokenizing():
    p = Lark(r"""
        _NL: /\n/
        line: EVERYTHING? _NL
        // Lower priority on the terminal, so /abc/ is tried first.
        EVERYTHING.-1: /.+/
        chunk: /abc/ _NL
        start: (line|chunk)+
    """, parser='lalr')
    text = '\nx\nabc\n'
    print(p.parse(text).pretty())
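To see why the terminal priority matters, here is a rough standard-library sketch of a priority-ordered regex lexer. It is only a simplified model and not Lark's actual implementation: the `Terminal` class, the `tokenize` function, and the name `ABC` for the anonymous /abc/ terminal are hypothetical, and ties between equal priorities are resolved here simply by listing order, which is an assumption of this sketch (Lark applies its own tie-breaking rules).

```python
import re
from typing import NamedTuple


class Terminal(NamedTuple):
    """Hypothetical stand-in for a lexer terminal (not a Lark class)."""
    name: str
    pattern: str
    priority: int = 0


def tokenize(text, terminals):
    """Split text into (name, value) pairs, trying higher priorities first."""
    # sorted() is stable, so equal priorities keep their listed order.
    ordered = sorted(terminals, key=lambda t: -t.priority)
    tokens = []
    pos = 0
    while pos < len(text):
        for term in ordered:
            m = re.compile(term.pattern).match(text, pos)
            if m and m.end() > pos:
                tokens.append((term.name, m.group()))
                pos = m.end()
                break
        else:
            raise ValueError(f"no terminal matches at position {pos}")
    return tokens


text = '\nx\nabc\n'

# Equal priorities: the catch-all pattern is tried before /abc/ and wins.
equal = [Terminal("EVERYTHING", r".+"), Terminal("ABC", r"abc"), Terminal("_NL", r"\n")]
equal_result = tokenize(text, equal)

# Lowering the catch-all's priority lets /abc/ match first.
lowered = [Terminal("EVERYTHING", r".+", -1), Terminal("ABC", r"abc"), Terminal("_NL", r"\n")]
lowered_result = tokenize(text, lowered)

print([name for name, _ in equal_result])
print([name for name, _ in lowered_result])
```

In this model the second run tokenizes abc as `ABC` rather than `EVERYTHING`, which mirrors the effect of writing `EVERYTHING.-1` in the grammar: the parser then only ever sees a token that fits chunk.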