How to parse restructuredtext in python?

Is there any module that can parse reStructuredText into a tree model?

Can docutils or sphinx do this?

Docutils does indeed contain the tools to do this.

What you probably want is the parser at docutils.parsers.rst.

See this page for details on what is involved. There are also some examples in docutils/examples.py; in particular, check out the internals() function, which is probably of interest.
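As a quick way to see the kind of tree docutils produces, here is a minimal sketch. It does not use the parser class or internals() mentioned above; it swaps in docutils.core.publish_doctree, which runs the standard reader, parser, and transforms in one call. The SAMPLE text and the outline() helper are made up for illustration.

```python
import docutils.core
import docutils.nodes

# Hypothetical sample input, only for illustration.
SAMPLE = """\
Shopping
========

Things to buy:

- spam
- eggs
"""


def outline(node, depth=0):
    """Yield one indented line per element node; text nodes are skipped."""
    if isinstance(node, docutils.nodes.Element):
        yield "  " * depth + node.tagname
        for child in node.children:
            yield from outline(child, depth + 1)


# publish_doctree parses the source and returns the document tree.
doctree = docutils.core.publish_doctree(SAMPLE)
lines = list(outline(doctree))
print("\n".join(lines))
```

Every node in the tree is an instance of a class from docutils.nodes, so the same recursion works for any document.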

I'd like to expand on the answer from Gareth Latty. "What you probably want is the parser at docutils.parsers.rst" is a good starting point, but what comes next? Namely:

How to parse restructuredtext in python?

Below is the exact answer for Python 3.6 and docutils 0.14:

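import docutils.frontend
import docutils.nodes
import docutils.parsers.rst
import docutils.utils

def parse_rst(text: str) -> docutils.nodes.document:
    """Parse reStructuredText source into a docutils document tree."""
    parser = docutils.parsers.rst.Parser()
    # Collect the default settings that the reST parser component expects.
    components = (docutils.parsers.rst.Parser,)
    settings = docutils.frontend.OptionParser(components=components).get_default_values()
    # Create an empty document and let the parser populate it in place.
    document = docutils.utils.new_document('<rst-doc>', settings=settings)
    parser.parse(text, document)
    return document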
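A note for readers on newer docutils: recent releases mark OptionParser as deprecated. The sketch below is one version-tolerant variant of the same idea. It assumes that docutils.frontend.get_default_settings exists in the installed release and falls back to OptionParser when it does not. The default_settings() helper name is hypothetical.

```python
import docutils.frontend
import docutils.nodes
import docutils.parsers.rst
import docutils.utils


def default_settings():
    """Return default settings for the reST parser (hypothetical helper)."""
    parser_cls = docutils.parsers.rst.Parser
    # Assumption: newer releases expose get_default_settings(); older ones
    # (such as 0.14) only offer OptionParser.
    getter = getattr(docutils.frontend, "get_default_settings", None)
    if getter is not None:
        return getter(parser_cls)
    option_parser = docutils.frontend.OptionParser(components=(parser_cls,))
    return option_parser.get_default_values()


def parse_rst(text: str) -> docutils.nodes.document:
    """Parse reStructuredText source into a document tree."""
    document = docutils.utils.new_document("<rst-doc>", settings=default_settings())
    docutils.parsers.rst.Parser().parse(text, document)
    return document


doc = parse_rst("Hello, *world*!")
# pformat() renders the tree as indented pseudo-XML, handy for inspection.
print(doc.pformat())
```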

The resulting document can then be processed with, for example, the visitor below, which prints every reference node in the document:

class MyVisitor(docutils.nodes.NodeVisitor):

    def visit_reference(self, node: docutils.nodes.reference) -> None:
        """Called for "reference" nodes."""
        print(node)

    def unknown_visit(self, node: docutils.nodes.Node) -> None:
        """Called for all other node types."""
        pass

Here's how to run it:

doc = parse_rst('spam spam lovely spam')
visitor = MyVisitor(doc)
doc.walk(visitor)
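Note that 'spam spam lovely spam' contains no hyperlinks, so the visitor above has nothing to print for it. To see the visitor pattern do some work, here is a self-contained sketch that collects link targets from a text that does contain a reference. The SAMPLE string and the LinkCollector class are hypothetical, and the settings lookup assumes docutils.frontend.get_default_settings may or may not be present in the installed release.

```python
import docutils.frontend
import docutils.nodes
import docutils.parsers.rst
import docutils.utils

# Hypothetical input with one embedded hyperlink.
SAMPLE = "See `Docutils <https://docutils.sourceforge.io/>`_ for details."


class LinkCollector(docutils.nodes.NodeVisitor):
    """Hypothetical visitor that records the URI of each reference node."""

    def __init__(self, document):
        super().__init__(document)
        self.uris = []

    def visit_reference(self, node):
        # Only references that carry a URI directly have a 'refuri' attribute.
        uri = node.get("refuri")
        if uri is not None:
            self.uris.append(uri)

    def unknown_visit(self, node):
        pass  # ignore every other node type


def parse_rst(text):
    """Parse reStructuredText with the parser only (no transforms)."""
    parser_cls = docutils.parsers.rst.Parser
    # Assumption: prefer get_default_settings() when the release has it.
    getter = getattr(docutils.frontend, "get_default_settings", None)
    if getter is not None:
        settings = getter(parser_cls)
    else:
        settings = docutils.frontend.OptionParser(
            components=(parser_cls,)).get_default_values()
    document = docutils.utils.new_document("<rst-doc>", settings=settings)
    parser_cls().parse(text, document)
    return document


doc = parse_rst(SAMPLE)
collector = LinkCollector(doc)
doc.walk(collector)
print(collector.uris)
```

Storing results on the visitor instead of printing them makes the output easy to reuse, for example to check links in a later step.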
