Use lark to analyze reST markup language like sections

I would like to define a basic grammar as a way to get started with lark. Here is my M(not)WE.

from lark import Lark

GRAMMAR = r"""
?start: _NL* (day_heading)*

day_heading : "==" _NL day_nb _NL "==" _NL+ (paragraph _NL)*
day_nb      : /\d{2}/
paragraph   : /[^\n={2}]+/ (_NL+ paragraph)*
_NL         : /(\r?\n[\t ]*)+/
"""

parser = Lark(GRAMMAR)

tree = parser.parse("""


==
12
==

Bla, bla
Bli, Bli



Blu, Blu


==
10
==


Blo, blo


    """)

print(tree.pretty())

This prints:

start
  day_heading
    day_nb      12
    paragraph
      Bla, bla
      paragraph
        Bli, Bli
        paragraph       Blu, Blu
  day_heading
    day_nb      10
    paragraph   Blo, blo
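
One detail of the grammar above that may be worth checking: inside square brackets, `{2}` is not a repetition count. The class `[^\n={2}]` simply excludes the five characters newline, `=`, `{`, `2` and `}`. The sketch below uses only the standard `re` module to show the effect; the helper name `matched_prefix` and the sample strings are made up for illustration.

```python
import re

# Same pattern as the paragraph terminal in the grammar above.
PARAGRAPH_LINE = re.compile(r"[^\n={2}]+")


def matched_prefix(text):
    """Return the part of `text` the pattern matches from position 0, or None."""
    m = PARAGRAPH_LINE.match(text)
    return m.group(0) if m else None


print(repr(matched_prefix("Bla, bla")))  # the whole line matches
print(repr(matched_prefix("Room 21")))   # the match stops just before the '2'
print(repr(matched_prefix("{x}")))       # no match at all: '{' is excluded
```

If the intent is only to keep paragraph lines from containing `=`, a class such as `[^\n=]` would presumably be enough, but that is an assumption about the intent rather than something stated in the question.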

In a YAML-like format, the tree I would like to get is the following one.

- section:
  - day: 12
  - contents:
    - paragraph: |
        Bla, bla
        Bli, Bli
    - paragraph: "Blu, Blu"

- section:
  - day: 10
  - contents:
    - paragraph: "Blo, blo"

How should I modify my EBNF?

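Here is one possible answer. Replacing _NL with NL would keep the newlines in the tree, because lark filters out terminals whose names start with an underscore.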

from lark import Lark

GRAMMAR = r"""
?start: _NL* (day_heading)*

day_heading : "==" _NL day_nb _NL "==" _NL+ (paragraph)+
day_nb      : /\d{2}/

paragraph : (line _NL)+

line : /[^\n={2}]+/
_NL  : /(\r?\n[\t ]*)+/
"""

parser = Lark(GRAMMAR)

tree = parser.parse("""


==
12
==

Bla, bla
Bli, Bli



Blu, Blu


==
10
==


Blo, blo


    """)

print(tree.pretty())

This produces:

start
  day_heading
    day_nb      12
    paragraph
      line      Bla, bla
      line      Bli, Bli
      line      Blu, Blu
  day_heading
    day_nb      10
    paragraph
      line      Blo, blo
