How to find titles à la reStructuredText
Is there a regex pattern to match titles in the following reStructuredText-like text? The difficulty is that the number of equal signs must be equal to the length of the title.
Some basic text.
=========
One Title
=========
For titles the numbers of sign `=` must be equal to the length of the text title.
=============
Another title
=============
And so on...
Search for matches of the following pattern:
(?:^|\n)(=+)\r?\n(?!=)([^\n\r]+)\r?\n(=+)(?:\r?\n|$)
If a match is found, check whether the first, second, and third groups have the same length. If so, the title is the content of the second group.
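The regex-plus-length-check approach described above could be sketched in Python roughly as follows. The names `TITLE_RE`, `find_titles`, and `sample` are made up for this illustration, and the sketch assumes the pattern is written with single backslashes in a raw string; it only handles `=` as the adornment character.

```python
import re

# Overline of "=", a title line that does not start with "=", then an underline of "=".
# (Pattern taken from the answer above; names below are illustrative only.)
TITLE_RE = re.compile(r"(?:^|\n)(=+)\r?\n(?!=)([^\n\r]+)\r?\n(=+)(?:\r?\n|$)")


def find_titles(text):
    """Return titles whose overline and underline match the title length."""
    titles = []
    for match in TITLE_RE.finditer(text):
        overline, title, underline = match.groups()
        # The regex alone cannot compare lengths, so check them here.
        if len(overline) == len(title) == len(underline):
            titles.append(title)
    return titles


sample = """Some basic text.
=========
One Title
=========
For titles the numbers of sign `=` must be equal to the length of the text title.
=============
Another title
=============
And so on...
"""

print(find_titles(sample))  # ['One Title', 'Another title']
```

Note that each match consumes the newline after the underline, so a second title that starts on the very next line would probably be missed by this sketch; replacing the trailing group with a lookahead is one possible way around that.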
To support the full syntax for section titles, you could use the docutils package:
#!/usr/bin/env python3
"""
some text

=====
Title
=====
Subtitle
--------

Titles are underlined (or over- and underlined) with a printing
nonalphanumeric 7-bit ASCII character. Recommended choices are "``= -
` : ' " ~ ^ _ * + # < >``". The underline/overline must be at least
as long as the title text.

A lone top-level (sub)section is lifted up to be the document's (sub)title.
"""
from docutils.core import publish_doctree


def section_title(node):
    """Whether `node` is a section title.

    Note: it DOES NOT include document title!
    """
    try:
        return node.parent.tagname == "section" and node.tagname == "title"
    except AttributeError:
        return None  # not a section title


# get document tree by parsing this module's docstring as reStructuredText
doctree = publish_doctree(__doc__)
# collect every node accepted by section_title()
# (newer docutils releases deprecate traverse() in favor of findall())
titles = doctree.traverse(condition=section_title)
print("\n".join([t.astext() for t in titles]))
Output:
Title
Subtitle