
Convert HTML with MathJax to Markdown with pandoc

I have some HTML files that include MathJax commands. I would like to convert them to PHP Markdown Extra using pandoc.

The problem is that pandoc adds "\\" before all math commands, for example \\begin{equation}, \\$, x\\^2, etc.

Do you know how to avoid that with pandoc? I think this is a related question: How to convert HTML with mathjax into latex using pandoc?

You can write a short Haskell program unescape.hs:

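{-# LANGUAGE OverloadedStrings #-}
-- Disable backslash escaping of special characters when writing strings to markdown.
-- toJSONFilter (from pandoc-types) replaces the older, deprecated toJsonFilter.
import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter unescape
  where
    -- Emit every plain string as raw markdown so the writer does not escape it.
    unescape :: Inline -> Inline
    unescape (Str xs) = RawInline "markdown" xs
    unescape x        = x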

Now compile it with ghc --make unescape.hs and use it like this:

pandoc -f html -t json | ./unescape | pandoc -f json -t markdown

This disables escaping of special characters (like $) in the markdown output.

A simpler approach might be to pipe pandoc's normal markdown output through sed:

pandoc -f html -t markdown | sed -e 's/\\\([$^_*]\)/\1/g'
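
To see what that sed expression does without running pandoc, here is a minimal sketch applied to a single line of text. The sample string is made up for illustration and is not actual pandoc output; it only mimics the escaped characters described in the question.

```shell
# Hypothetical sample line with backslash-escaped math characters.
escaped='\$ x\^2 + y\_1 \$'

# Remove one backslash wherever it directly precedes $, ^, _ or *.
unescaped=$(printf '%s\n' "$escaped" | sed -e 's/\\\([$^_*]\)/\1/g')

printf '%s\n' "$unescaped"
```

Note that the expression only touches a backslash followed by one of those four characters, so other escapes (for instance a doubled backslash in front of begin) are left alone. It also cannot tell math from ordinary text, so an intentionally escaped * or _ elsewhere in the document would be unescaped as well.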
