
How to write these Pandoc Haskell filters in Python?

Question

I need to convert these Pandoc Haskell filters to Python using [pandocfilters].

#!/usr/bin/env runhaskell

import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter separator
  where separator (Para [Span ("",[],[]) [Str "___separator___"]])
          = RawBlock (Format "html") "<div class=\"separator\">***</div>"
        separator x = x

#!/usr/bin/env runhaskell

import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter separator
  where separator (Para [Span ("",[],[]) [Str "___separator___"]])
          = (Para [Span ("",[],[]) [Str "***"]])
        separator x = x

I expect it will be of the general form:

#!/usr/bin/env python

from pandocfilters import toJSONFilter, Str

def separator(key, value, format, meta):
    """Need to write this."""
    pass

if __name__ == '__main__':
    toJSONFilter(separator)

Bonus if someone knows how to add "centered" formatting to the second filter for the docx format.

Background

In LaTeX I have a \separator{} macro which produces three centered asterisks (***). When processing the document with Pandoc to html and docx, I use an alternate definition of \separator{} that just emits the text ___separator___. I then replace ___separator___ with content that works correctly in the target format. I need to switch from Haskell to Python filters for cross-system compatibility reasons.

Example

The input file looks like this:

\documentclass{memoir}
\begin{document}
    \newcommand{\separator}{\_\_\_separator\_\_\_}

    First paragraph.

    \separator{}

    Second paragraph.
\end{document}

Default pandoc html output with no filter:

<p>First paragraph.</p>
<p><span>___separator___</span></p>
<p>Second paragraph.</p>

Required html output when filtered:

<p>First paragraph.</p>
<div class="separator">***</div>
<p>Second paragraph.</p>

The docx filter should ideally produce a centered paragraph containing ***.

Have a look at this filter:

#!/usr/bin/env python

import sys

from pandocfilters import toJSONFilter, Str, Para

def sep(key, value, format, meta):
    if key == 'Para':
        # Dump every Para to stderr so its JSON structure can be inspected.
        sys.stderr.write("--- Found a Para with value: " + str(value) + "\n")
        if len(value) == 1:
            if value[0]['t'] == 'Str' and value[0]['c'] == '---separator---':
                return Para([Str("FOUND A SEPARATOR")])
    return None

if __name__ == "__main__":
    toJSONFilter(sep)

When given this input markdown:

This is a paragraph.

---separator---

This is another paragraph.

---separator---

it will produce this output HTML via pandoc --filter ... input.md -o output.html :

<p>This is a paragraph.</p>
<p>FOUND A SEPARATOR</p>
<p>This is another paragraph.</p>
<p>FOUND A SEPARATOR</p>

It also prints the structure of each Para node to stderr, so that you can see exactly what the nodes look like.
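For the LaTeX input from the question, the Para is not a bare Str but a Span wrapping a Str (this is what the Haskell pattern Para [Span ("",[],[]) [Str "___separator___"]] matches). The following is a minimal sketch of an explicit check for that shape, assuming the 't'/'c' JSON layout used by the filter above. The names is_separator_para and SEPARATOR_TEXT are hypothetical helpers, not part of pandocfilters.

```python
# Marker text emitted by the alternate \separator{} definition.
SEPARATOR_TEXT = '___separator___'

def is_separator_para(key, value):
    """Return True for a Para whose only child is Span attr [Str SEPARATOR_TEXT]."""
    if key != 'Para' or len(value) != 1:
        return False
    span = value[0]
    if not isinstance(span, dict) or span.get('t') != 'Span':
        return False
    attr, inlines = span['c']  # Span content is [attr, [inline, ...]]
    return (len(inlines) == 1
            and inlines[0].get('t') == 'Str'
            and inlines[0].get('c') == SEPARATOR_TEXT)

# The Para value as a filter action would receive it: one Span with empty attributes.
example = [{'t': 'Span',
            'c': [['', [], []], [{'t': 'Str', 'c': SEPARATOR_TEXT}]]}]

print(is_separator_para('Para', example))                       # True
print(is_separator_para('Para', [{'t': 'Str', 'c': 'Hello'}]))  # False
```

Checking the node type explicitly avoids relying on exceptions when the paragraph holds some other kind of inline.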

For LaTeX to HTML:

#!/usr/bin/env python

from pandocfilters import toJSONFilter, RawBlock

def separator(key, value, format, meta):
    # Match a Para whose only child is a Span wrapping Str "___separator___".
    if key == 'Para' and len(value) == 1:
        try:
            if value[0]['c'][1][0]['c'] == '___separator___':
                return RawBlock('html', '<div class="separator">***</div>')
        except (KeyError, IndexError, TypeError):
            # Some other inline (e.g. a plain Str); leave the block unchanged.
            return None
    return None

if __name__ == '__main__':
    toJSONFilter(separator)

For LaTeX to docx:

#!/usr/bin/env python

from pandocfilters import toJSONFilter, Str, Para

def separator(key, value, format, meta):
    # Match a Para whose only child is a Span wrapping Str "___separator___".
    if key == 'Para' and len(value) == 1:
        try:
            if value[0]['c'][1][0]['c'] == '___separator___':
                return Para([Str('***')])
        except (KeyError, IndexError, TypeError):
            # Some other inline (e.g. a plain Str); leave the block unchanged.
            return None
    return None

if __name__ == '__main__':
    toJSONFilter(separator)

Note that this doesn't center the ***, as I can't seem to work out how to inject that formatting.
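One possible way to get the centering, offered as an untested sketch rather than a confirmed solution: pandoc's docx writer passes RawBlock content through when its format is openxml, so the filter could return a raw WordprocessingML paragraph with center justification. The helper below is hypothetical (separator_block and CENTERED_OPENXML are names invented here). It builds plain dicts of the same shape that pandocfilters' RawBlock and Para constructors produce, and it assumes the format string pandoc hands to the filter is 'docx' or 'html'/'html5'.

```python
# Raw WordprocessingML for a centered paragraph containing "***".
CENTERED_OPENXML = (
    '<w:p>'
    '<w:pPr><w:jc w:val="center"/></w:pPr>'
    '<w:r><w:t>***</w:t></w:r>'
    '</w:p>'
)

def separator_block(fmt):
    """Return the replacement block for the given pandoc output format."""
    if fmt == 'docx':
        # Assumption: the docx writer emits 'openxml' raw blocks verbatim.
        return {'t': 'RawBlock', 'c': ['openxml', CENTERED_OPENXML]}
    if fmt in ('html', 'html5'):
        return {'t': 'RawBlock',
                'c': ['html', '<div class="separator">***</div>']}
    # Fallback for any other format: an ordinary, uncentered paragraph.
    return {'t': 'Para', 'c': [{'t': 'Str', 'c': '***'}]}
```

Inside the separator action, returning separator_block(format) in place of Para([Str('***')]) would let a single filter serve both the html and docx targets.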
