How to write these Pandoc Haskell filters in Python?

Question


I need to convert these Pandoc Haskell filters to Python using [pandocfilters].

#!/usr/bin/env runhaskell

import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter separator
  where separator (Para [Span ("",[],[]) [Str "___separator___"]])
          = RawBlock (Format "html") "<div class=\"separator\">***</div>"
        separator x = x

#!/usr/bin/env runhaskell

import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter separator
  where separator (Para [Span ("",[],[]) [Str "___separator___"]])
          = (Para [Span ("",[],[]) [Str "***"]])
        separator x = x

I expect it will be of the general form

#!/usr/bin/env python

from pandocfilters import toJSONFilter, Str

def separator(key, value, format, meta):
    """Need to write this."""
    pass

if __name__ == '__main__':
    toJSONFilter(separator)

Bonus if someone knows how to add "centered" formatting to the second filter for the docx format.

Background

In LaTeX I have a \separator{} macro which produces three centered asterisks (***). When processing this with Pandoc to html and docx, I use an alternate macro definition for \separator{} which just creates the text ___separator___. I then replace ___separator___ with content that works correctly in the new format. I need to switch from Haskell to Python filters for cross-system compatibility reasons.

Example

The input file looks like this:

\documentclass{memoir}
\begin{document}
    \newcommand{\separator}{\_\_\_separator\_\_\_}

    First paragraph.

    \separator{}

    Second paragraph.
\end{document}

Default pandoc html output with no filter:

<p>First paragraph.</p>
<p><span>___separator___</span></p>
<p>Second paragraph.</p>

Required html output when filtered:

<p>First paragraph.</p>
<div class="separator">***</div>
<p>Second paragraph.</p>

The docx filter should ideally produce a centered paragraph containing ***.

Answer 1

Have a look at this filter:

#!/usr/bin/env python

import sys

from pandocfilters import toJSONFilter, Str, Para

def sep(key, value, format, meta):
    if key == 'Para':
        # Dump each paragraph's structure to stderr for inspection.
        sys.stderr.write("--- Found a Para with value: " + str(value) + "\n")
        if len(value) == 1:
            if value[0]['t'] == 'Str' and value[0]['c'] == '---separator---':
                return Para([Str("FOUND A SEPARATOR")])
    return None

if __name__ == "__main__":
    toJSONFilter(sep)

When given this input markdown:

This is a paragraph.

---separator---

This is another paragraph.

---separator---

it will produce this HTML output via pandoc --filter ... input.md -o output.html:

<p>This is a paragraph.</p>
<p>FOUND A SEPARATOR</p>
<p>This is another paragraph.</p>
<p>FOUND A SEPARATOR</p>

It also prints the structure of the Para nodes to stderr so that you can see exactly what they look like.
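For the LaTeX input in the question, the paragraph is wrapped in a Span rather than being a bare Str, so the structure dumped to stderr will be one level deeper than in the markdown case. As a rough sketch, the check could be written as a small helper that walks that shape explicitly. The name `is_separator` and the sample payloads are made up for illustration; the dict layout is assumed to follow pandoc's JSON AST for `Para [Span attr [Str ...]]`.

```python
def is_separator(key, value, marker="___separator___"):
    """Return True if (key, value) looks like Para [Span attr [Str marker]]."""
    if key != "Para" or len(value) != 1:
        return False
    inline = value[0]
    if inline.get("t") != "Span":
        return False
    # A Span's content is a pair: [attributes, list of inlines].
    _attr, children = inline["c"]
    return (
        len(children) == 1
        and children[0].get("t") == "Str"
        and children[0].get("c") == marker
    )


# Hand-written sample payloads, shaped like the 'value' a filter receives.
span_para = [{"t": "Span",
              "c": [["", [], []], [{"t": "Str", "c": "___separator___"}]]}]
plain_para = [{"t": "Str", "c": "Hello"}]

print(is_separator("Para", span_para))   # True
print(is_separator("Para", plain_para))  # False
```

A filter function with the usual `(key, value, format, meta)` signature could call this helper and return a replacement node when it matches, or None otherwise.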

Answer 2

For LaTeX to HTML:

#!/usr/bin/env python

from pandocfilters import toJSONFilter, Str, Para, RawBlock

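def separator(key, value, format, meta):
    if key == 'Para':
        if len(value) == 1:
            try:
                # Matches Para [Span attr [Str "___separator___"]]
                if value[0]['c'][1][0]['c'] == '___separator___':
                    return RawBlock('html', '<div class="separator">***</div>')
            except (KeyError, IndexError, TypeError):
                # The paragraph does not have the Span/Str shape; leave it alone.
                return None
    return None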

if __name__ == '__main__':
    toJSONFilter(separator)

For LaTeX to docx:

#!/usr/bin/env python

from pandocfilters import toJSONFilter, Str, Para, RawBlock

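def separator(key, value, format, meta):
    if key == 'Para':
        if len(value) == 1:
            try:
                # Matches Para [Span attr [Str "___separator___"]]
                if value[0]['c'][1][0]['c'] == '___separator___':
                    return Para([Str('***')])
            except (KeyError, IndexError, TypeError):
                # The paragraph does not have the Span/Str shape; leave it alone.
                return None
    return None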

if __name__ == '__main__':
    toJSONFilter(separator)

Note that this doesn't center the *** as I can't seem to work out how to inject that formatting.
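One possible way to get the centering, offered as an untested sketch: emit a raw block of format `openxml` containing a paragraph with centered justification, on the assumption that your pandoc version's docx writer passes such raw blocks through. The sketch also folds both filters into one function that dispatches on the `format` argument. The names `raw_block` and `OPENXML_CENTERED` are made up here; `raw_block` just builds the same dict that `pandocfilters.RawBlock(fmt, text)` would return, so the snippet runs without the library installed.

```python
# WordprocessingML for a centered paragraph containing "***".
# Assumption: the docx writer inserts raw "openxml" blocks verbatim.
OPENXML_CENTERED = (
    '<w:p><w:pPr><w:jc w:val="center"/></w:pPr>'
    '<w:r><w:t>***</w:t></w:r></w:p>'
)


def raw_block(fmt, text):
    # Same shape as the node built by pandocfilters.RawBlock(fmt, text).
    return {"t": "RawBlock", "c": [fmt, text]}


def separator(key, value, format, meta):
    if key != "Para" or len(value) != 1:
        return None
    try:
        # Matches Para [Span attr [Str "___separator___"]]
        if value[0]["c"][1][0]["c"] != "___separator___":
            return None
    except (KeyError, IndexError, TypeError):
        return None
    if format == "docx":
        return raw_block("openxml", OPENXML_CENTERED)
    if format in ("html", "html5"):
        return raw_block("html", '<div class="separator">***</div>')
    return None  # other output formats: keep the paragraph unchanged
```

To use it as a real filter, pass `separator` to `toJSONFilter` from pandocfilters in the usual `if __name__ == '__main__':` block, as in the scripts above.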

2 answers

Answer 1
3 2015-08-30 17:23:22

Answer 2
0 2015-08-30 19:22:49
