
Python (3.4+) Regex for template extensions

I have the following "sample" content:

{% block some_name %}Some Text{% endblock %} 
Something Else
{% block another_name %}Some Other Content{% endblock %}

I am trying to use a regex to find both blocks (first the names, then the full sections), but my "findall" call only returns the first name:

re.findall(r"\{% block ([^\{%]+?) %\}[\s\S]*\{% endblock %\}", contents)

(Here the variable "contents" holds the sample string at the top.)

So I need two searches, or a single combined one if possible, that return something like:

[
    ['some_name', 'another_name'],
    ['{% block some_name %}Some Text{% endblock %}', '{% block another_name %}Some Other Content{% endblock %}']
]

You may use

r'(\{%\s*block\s+(.+?)\s*%}[\s\S]*?\{%\s*endblock\s*(?:\2\s*)?%})'

See the regex demo.

Details

  • ( - start of an outer capturing group #1 (so as to get the whole match into the list of tuples returned by re.findall):
    • \{% - a {% substring
    • \s* - 0+ whitespaces
    • block - a literal block substring
    • \s+ - 1+ whitespaces
    • (.+?) - 1+ chars other than line break chars (replace . with [\s\S] to also match line breaks), as few as possible, captured into Group 2
    • \s* - 0+ whitespaces
    • %} - a %} substring
    • [\s\S]*? - any 0+ chars, as few as possible
    • \{% - a {% substring
    • \s* - 0+ whitespaces
    • endblock - a literal endblock substring
    • \s* - 0+ whitespaces
    • (?:\2\s*)? - an optional sequence of the Group 2 value and 0+ whitespaces after it
    • %} - a %} substring
  • ) - end of the outer capturing group #1.
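
The "as few as possible" part is what makes the difference from the pattern in the question. A minimal sketch of that point, reusing the question's own pattern and only toggling the quantifier (the variable names here are illustrative, not from the original post):

```python
import re

s = ('{% block some_name %}Some Text{% endblock %} \n'
     'Something Else\n'
     '{% block another_name %}Some Other Content{% endblock %}')

# Greedy body, as in the question: [\s\S]* runs on to the LAST endblock,
# so the whole string becomes a single match.
greedy = r"\{% block ([^\{%]+?) %\}[\s\S]*\{% endblock %\}"

# Lazy body: [\s\S]*? stops at the FIRST endblock after each opening tag.
lazy = r"\{% block ([^\{%]+?) %\}[\s\S]*?\{% endblock %\}"

greedy_names = re.findall(greedy, s)
lazy_names = re.findall(lazy, s)

print(greedy_names)  # ['some_name']
print(lazy_names)    # ['some_name', 'another_name']
```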

See the Python demo:

import re
rx = r'(\{%\s*block\s+(.+?)\s*%}[\s\S]*?\{%\s*endblock\s*(?:\2\s*)?%})'
s = '{% block some_name %}Some Text{% endblock %} \nSomething Else\n{% block another_name %}Some Other Content{% endblock %}'
print(list(map(list, zip(*re.findall(rx, s))))) # Extracting and transposing the list
# => [['{% block some_name %}Some Text{% endblock %}', '{% block another_name %}Some Other Content{% endblock %}'], ['some_name', 'another_name']]
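
Note that the transposed result above lists the full sections first and the names second, whereas the question asks for the names first. One possible way to get that order is to iterate with re.finditer instead; the sketch below assumes a hypothetical helper named find_blocks and drops the outer capturing group, so the name becomes Group 1 and the backreference becomes \1:

```python
import re

# Same pattern as above without the outer group: the block name is Group 1,
# and m.group(0) gives the whole matched section.
BLOCK_RX = re.compile(
    r'\{%\s*block\s+(.+?)\s*%}[\s\S]*?\{%\s*endblock\s*(?:\1\s*)?%}'
)

def find_blocks(text):
    """Return [names, sections] for every block found in text (illustrative helper)."""
    names, sections = [], []
    for m in BLOCK_RX.finditer(text):
        names.append(m.group(1))     # block name
        sections.append(m.group(0))  # full "{% block ... %}...{% endblock %}" text
    return [names, sections]

s = ('{% block some_name %}Some Text{% endblock %} \n'
     'Something Else\n'
     '{% block another_name %}Some Other Content{% endblock %}')

result = find_blocks(s)
print(result)
# => [['some_name', 'another_name'], ['{% block some_name %}Some Text{% endblock %}', '{% block another_name %}Some Other Content{% endblock %}']]

# The optional (?:\1\s*)? part also accepts a closing tag that repeats the name.
named = find_blocks('{% block a %}x{% endblock a %}')
```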


 