Using lark to parse the sections of a reST-like markup language
I would like to define a basic grammar to get started with lark. Here is my M(not)WE.
from lark import Lark
GRAMMAR = r"""
?start: _NL* (day_heading)*
day_heading : "==" _NL day_nb _NL "==" _NL+ (paragraph _NL)*
day_nb : /\d{2}/
paragraph : /[^\n={2}]+/ (_NL+ paragraph)*
_NL : /(\r?\n[\t ]*)+/
"""
parser = Lark(GRAMMAR)
tree = parser.parse("""
==
12
==
Bla, bla
Bli, Bli
Blu, Blu
==
10
==
Blo, blo
""")
print(tree.pretty())
This prints:
start
  day_heading
    day_nb 12
    paragraph
      Bla, bla
      paragraph
        Bli, Bli
        paragraph Blu, Blu
  day_heading
    day_nb 10
    paragraph Blo, blo
The tree I want is the following one, written in a YAML-like format.
- section:
  - day: 12
  - contents:
    - paragraph: |
        Bla, bla
        Bli, bli
    - paragraph: "Blu, blu"
- section:
  - day: 10
  - contents:
    - paragraph: "Blo, blo"
How can I modify my EBNF?
Here is a possible answer. Replacing _NL with NL makes it possible to keep the newlines in the tree.
from lark import Lark
GRAMMAR = r"""
?start: _NL* (day_heading)*
day_heading : "==" _NL day_nb _NL "==" _NL+ (paragraph)+
day_nb : /\d{2}/
paragraph : (line _NL)+
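// Inside a character class, {2} is not a quantifier: [^\n={2}] also rejects
// the literal characters "{", "2" and "}". Excluding "\n" and "=" is enough.
line : /[^\n=]+/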
_NL : /(\r?\n[\t ]*)+/
"""
parser = Lark(GRAMMAR)
tree = parser.parse("""
==
12
==
Bla, bla
Bli, Bli
Blu, Blu
==
10
==
Blo, blo
""")
print(tree.pretty())
This produces:
start
  day_heading
    day_nb 12
    paragraph
      line Bla, bla
      line Bli, Bli
      line Blu, Blu
  day_heading
    day_nb 10
    paragraph
      line Blo, blo