
Convert Markdown links to HTML with Pandoc

In my new project I have multiple Markdown files which are linked to each other. These links refer to the original .md files.

Example:

File README.md

...
1. [Development documentation](Development.md)
1. [User documentation](Usage.md)
...

If I convert these files with Pandoc, e.g., to HTML files, all links still point to the original .md files. I'm looking for a way to also convert the link targets, so that the output files refer to files of the output type, such as HTML, PDF, TeX, etc. Is there a way to convert the internal link type with Pandoc?

I use this to convert the files:

pandoc -f markdown -t html5 input.md -o output.html

Example with the built-in Lua filters:

-- links-to-html.lua
function Link(el)
  -- Rewrite a trailing ".md", or ".md" directly before a "#fragment".
  el.target = el.target:gsub("%.md$", ".html"):gsub("%.md#", ".html#")
  return el
end

Then:

pandoc -f markdown -t html5 input.md -o output.html --lua-filter=links-to-html.lua

You can create a filter that checks every link element and, if the URL ends with .md, replaces that suffix with .html.

Example with Python, using the panflute package:

import panflute as pf

def action(elem, doc):
    if isinstance(elem, pf.Link) and elem.url.endswith('.md'):
        elem.url = elem.url[:-3] + '.html'
        return elem

if __name__ == '__main__':
    pf.run_filter(action)

Assuming you are going to serve your HTML pages via a web server, it is relatively simple to resolve all *.md URLs as *.html ones instead of rewriting them via Pandoc, e.g., using nginx:

location ~ \.md$ {
  if (!-f $request_filename) {
    rewrite ^(.*)\.md$ $1 permanent;
  }
}

location / {
  try_files $uri $uri.html =404;
}

Alternatively, you can replace all md links with html using sed (taken from here):

Change all internal file URLs from pointing to *.md links and instead point to the local *.html file

  1. recursively run this sed command (programmatically replace FILENAME)

     sed -i.bak '/href="\./s/\.md/.html/' FILENAME.html
  2. alternatively, run the following command instead (programmatically replace FILENAME)

     sed -e '/href="\./s/\.md/.html/' FILENAME.html > FILENAME.html.tmp && mv FILENAME.html.tmp FILENAME.html
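For comparison, here is a minimal Python sketch of the same post-processing idea, in case sed is not available or a whole directory tree needs rewriting. The names `rewrite_hrefs` and `rewrite_tree` are made up for this illustration. Like the sed commands above, it only touches href values that start with a dot (relative links such as ./Usage.md); unlike them, it rewrites every matching link on a line and requires .md to be followed by a closing quote, `#`, or `?`.

```python
import re
from pathlib import Path

# Matches href="./something.md" or href="../x/y.md#anchor".
# Only relative links starting with "." are touched.
HREF_MD = re.compile(r'(href="\.[^"#?]*)\.md(?=["#?])')


def rewrite_hrefs(html: str) -> str:
    """Return html with relative .md link targets pointing at .html."""
    return HREF_MD.sub(r"\1.html", html)


def rewrite_tree(root: str) -> int:
    """Rewrite every *.html file under root in place; return files changed."""
    changed = 0
    for path in Path(root).rglob("*.html"):
        old = path.read_text(encoding="utf-8")
        new = rewrite_hrefs(old)
        if new != old:
            path.write_text(new, encoding="utf-8")
            changed += 1
    return changed
```

Calling `rewrite_tree("site")` would process a hypothetical output directory named site. A regular expression is not an HTML parser, so this is only suitable for the predictable markup that Pandoc emits.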

I had a similar problem, so I made md_htmldoc.

It finds all of the .md files in a directory and then makes a separate directory where all the Markdown files have been converted to HTML.

It fixes hyperlinks (thanks to Sergio Correia's answer).

It also gathers up any local file references so that links to images and such still work.
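The general shape of such a batch conversion can be sketched in a few lines of Python. This is not md_htmldoc's actual implementation, only an illustration of the idea: the function names are hypothetical, the link fixing is delegated to the links-to-html.lua filter shown earlier (assumed to be in the current directory), and images or other local files are not copied.

```python
import subprocess
from pathlib import Path


def plan_conversions(src_dir: str, out_dir: str):
    """Pair every .md file under src_dir with its .html path under out_dir."""
    src, out = Path(src_dir), Path(out_dir)
    return [
        (md, out / md.relative_to(src).with_suffix(".html"))
        for md in sorted(src.rglob("*.md"))
    ]


def convert_all(src_dir: str, out_dir: str,
                lua_filter: str = "links-to-html.lua") -> None:
    """Run pandoc once per file, mirroring the source tree into out_dir."""
    for md, html in plan_conversions(src_dir, out_dir):
        html.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(
            ["pandoc", "-f", "markdown", "-t", "html5", str(md),
             "-o", str(html), f"--lua-filter={lua_filter}"],
            check=True,  # stop on the first failed conversion
        )
```

For example, `convert_all("docs", "site")` would turn docs/sub/Usage.md into site/sub/Usage.html, provided pandoc is on the PATH.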

For anyone using a Makefile to drive conversion, here is a Makefile fragment that provides a rule transforming a .md into a .html with links adjusted:

SHELL=/bin/bash

# Recipe lines must start with a tab character.
%.html: %.md
	( set -eu -o pipefail ; \
	pandoc -f markdown -t html $< | \
	sed -E 's/<a href="([^"]*)\.md/<a href="\1.html/g' > $@.tmp && mv -vf $@.tmp $@ ; )

If test.md exists in the current directory, running make test.html converts it.

The rule also takes care of not clobbering an existing HTML file (whatever the reason) until the conversion actually succeeds.

A slight modification to Sergio Correia's answer also catches anchor links in documents. Take care: in some rare cases this might garble links.

import panflute as pf

def action(elem, doc):
    if isinstance(elem, pf.Link):
        if elem.url.endswith('.md'):
            elem.url = elem.url[:-3] + '.html'
            return elem
        # str.find() returns -1 (truthy) when nothing is found, so test with "in".
        elif '.md#' in elem.url:
            elem.url = elem.url.replace('.md#', '.html#')
            return elem

if __name__ == '__main__':
    pf.run_filter(action)
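One way to reduce the risk of garbled links is to parse the URL before changing it, so that only the path of a relative link is rewritten and any query or fragment is carried over untouched. The helper below is a sketch using only the standard library; the name `md_to_html_url` is invented for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def md_to_html_url(url: str) -> str:
    """Map a relative .md link to .html, keeping any query and #fragment."""
    parts = urlsplit(url)
    # Leave absolute URLs (https://..., mailto:...) and host-qualified links alone.
    if parts.scheme or parts.netloc:
        return url
    # Only the path component decides whether this is a Markdown link.
    if not parts.path.endswith(".md"):
        return url
    new_path = parts.path[: -len(".md")] + ".html"
    return urlunsplit(parts._replace(path=new_path))
```

Inside the panflute filter, the body of the `pf.Link` branch could then be reduced to `elem.url = md_to_html_url(elem.url)`. A link such as page.html#see.md#x, which a plain string replacement would alter, is left as it is because its path does not end in .md.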
