
Converting markdown to HTML with JavaScript - restricting supported syntax

Original post: 2018-11-19 09:17:10 7 2 javascript/ html/ syntax/ markdown

I am currently using marked.js to convert markdown to HTML, so the users of my web app can create structured content. I am wondering if there is a way to restrict the supported syntax to just a subset, like

headers

italic text

bold text

lists with only 1 depth of indentation

quotes

I would like to prohibit conversion of lists with multiple levels of indentation, code blocks, headers in lists ...

The reason is that my web app should let users create content in a specific way, and if it is possible to create some crazily structured content (lists of headers, code in headers, lists of images ...), someone will surely do it.

2 answers

You have a few different options:

Marked.js uses a multi-step method to parse Markdown. It uses a lexer, which breaks the document up into tokens, a parser to convert those tokens to an abstract syntax tree (AST), and a renderer to convert the AST to HTML. You can override any of those pieces to alter the handling of various parts of the syntax.

For example, if you simply wanted to ignore lists and leave them out of the rendered HTML, replace the list function from the renderer with one which returns an empty string.
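The renderer override described above might look roughly like the sketch below. The object name `restrictedRenderer` is made up for illustration, and the commented wiring assumes a version of the `marked` package that provides `marked.use()` and `marked.parse()`; older releases were configured differently, so treat this as a starting point rather than a definitive recipe.

```javascript
// Renderer overrides whose method names mirror marked's built-in renderer.
// Returning an empty string leaves the element out of the generated HTML.
const restrictedRenderer = {
  // Drop every list, whatever arguments the renderer is called with.
  list() {
    return '';
  },
  // Drop fenced and indented code blocks as well.
  code() {
    return '';
  },
};

// Hypothetical wiring, assuming the `marked` package is installed:
//   const { marked } = require('marked');
//   marked.use({ renderer: restrictedRenderer });
//   const html = marked.parse('# Title\n\n- item\n');
```

Because the override ignores its arguments, it should not matter whether the installed version passes strings or token objects to renderer methods.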

Or, if you want the parser to act as if lists are not even a supported feature of Markdown, you could remove the list and listitem methods from the parser. In that case, the list would remain in the output, but would be treated as a paragraph instead.

Or, if you want to support one level of lists, but not nested lists, then you could replace the list and/or listitem methods in the parser with your own implementation that parses lists as you desire.
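One way to approximate the single-level restriction without rewriting parser methods is to filter the token tree before it is rendered. This is a different technique from the one described above, and it rests on assumptions: that list items arrive as tokens of type `list_item` with a `tokens` array, that a nested list appears in that array as a token of type `list`, and that the installed `marked` version supports the `walkTokens` hook. The function names are hypothetical.

```javascript
// Return a copy of `tokens` with every list removed, at any depth.
function withoutLists(tokens) {
  return tokens
    .filter((token) => token.type !== 'list')
    .map((token) =>
      Array.isArray(token.tokens)
        ? { ...token, tokens: withoutLists(token.tokens) }
        : token
    );
}

// Token callback: strip lists found anywhere inside a list item,
// so only the outermost level of a list survives.
function flattenListItem(token) {
  if (token.type === 'list_item' && Array.isArray(token.tokens)) {
    token.tokens = withoutLists(token.tokens);
  }
}

// Hypothetical wiring, assuming the `marked` package is installed:
//   const { marked } = require('marked');
//   marked.use({ walkTokens: flattenListItem });
//   const html = marked.parse('- a\n  - nested\n- b\n');
```

This sketch silently drops nested lists. Rejecting the input with an error message instead would be a small change inside `flattenListItem`.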

Note that there are also a number of advanced options, which use the above methods to alter the parser and/or renderer in various ways. For the most part, those options would not provide the features you are asking for, but browsing through the source code might give you some ideas of how to implement your own modifications.

However, there is the sanitize option, which will accept a sanitizer function. You could provide your own sanitizer which removes any unwanted elements from the HTML output. This would give a similar end result to overriding the renderer, but would be implemented differently. Depending on what you want to accomplish, one or the other may be more effective.

Another possibility would be to use Commonmark.js: parse the input, then walk the parsed tree and remove all nodes with (or without) a specific type. See this example; it worked fine for images, but failed for code blocks.

The downside of this approach is that the parsed markdown source will be traversed twice: once for editing and a second time for rendering.
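The tree-walking idea can be sketched as follows. The helper name `removeNodesOfType` is made up for illustration; it relies only on a root node with a `walker()` method, events of the form `{ entering, node }`, and `node.unlink()`, which is how the commonmark.js tree API is documented. Collecting the nodes first and unlinking them after the walk is a cautious choice made here so that the walker never has to continue from a detached node. Whether that explains the code-block failure mentioned above is not known.

```javascript
// Remove every node whose type is listed in `bannedTypes`.
// Returns the number of nodes that were unlinked.
function removeNodesOfType(root, bannedTypes) {
  const banned = new Set(bannedTypes);
  const doomed = [];
  const walker = root.walker();
  let event;
  // First pass: only record matching nodes, leaving the tree unchanged.
  while ((event = walker.next())) {
    if (event.entering && banned.has(event.node.type)) {
      doomed.push(event.node);
    }
  }
  // Second step: detach the recorded nodes once the walk has finished.
  doomed.forEach((node) => node.unlink());
  return doomed.length;
}

// Hypothetical wiring, assuming the `commonmark` package is installed:
//   const commonmark = require('commonmark');
//   const ast = new commonmark.Parser().parse(source);
//   removeNodesOfType(ast, ['code_block', 'image']);
//   const html = new commonmark.HtmlRenderer().render(ast);
```

As noted above, the tree is still traversed a second time when it is rendered.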