Converting markdown to html with javascript in rich text editor


I am developing a rich text editor for my website. If the user writes something that has HTML syntax, I would like the editor to convert it to HTML, just like the text editor in Stack Overflow.

I would like it to:

  1. split the text on each tag, with the resulting array elements including the tags that were written
  2. transform &lt; and &gt; to their corresponding signs (< and >), unless they are inside PRE and CODE tags

For now, I tried using a Regexp I found here for splitting the HTML, but if I test the code below, I get:

['Lorem ipsum dolor', 'sit amet', 'consectetur', 'adipiscing', 'elit.', 'Sed erat odio, fringilla in lorem eu.'], which is definitely not what I want. I would want something like:

['Lorem ipsum dolor', '<h1>', 'sit amet', '</h1>', '<h6>', 'consectetur', '<b>', 'adipiscing', '</b>', '</h6>', 'elit.', '<br>', 'Sed erat odio, fringilla in lorem eu.']

Then I would just:

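function splitHTML(str) {
  // The regex has no capturing group, so split() drops the matched tags
  return str.split(/<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>/g);
}

function isHTML(str) {
  // RegExp has no match() method; test() returns the boolean needed here
  return /<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>/.test(str);
}

const arr = splitHTML("Lorem ipsum dolor <h1>sit amet</h1>, <h6>consectetur <b>adipiscing</b> </h6>elit. <br>Sed erat odio, fringilla in lorem eu.");

// Assign by index: reassigning a for...of loop variable would not change the array
for (let i = 0; i < arr.length; i++) {
  if (isHTML(arr[i])) {
    arr[i] = arr[i].replaceAll('&lt;', '<');
    arr[i] = arr[i].replaceAll('&gt;', '>');
  }
}

arr.join();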

My question is:

How do I split a text so that the separators are included in the result?

I would also like to know how to check whether the code is between pre and code tags.

You do not have to iterate through an object to display the HTML. You can do something as simple as:

// Create a new iframe HTML element
const preview = document.createElement("iframe");

// Set a unique id so it is easier to reference in code later on (you can also use the id in CSS)
preview.id = "preview";

// Set the iframe's content according to your HTML string
preview.srcdoc = yourHtmlString;

// Add the iframe to the page's body (or whatever element you want)
document.body.append(preview);

If, for whatever reason, you have to iterate through the HTML elements, you can add the following additional code:

function forEachChild(element) {
  for (let i = 0; i < element.children.length; i++) {
    forEachChild(element.children[i]);

    // Whatever you want to do for each element, write it here

    // Please note that replacing "&lt;" and "&gt;" is not necessary when using the above code
    // snippet. However, if there is some other tag-specific code, here is how to add it:
    switch (element.children[i].tagName.toLowerCase()) {
      case "pre":
      case "code":
        // If there is something specific you want to do with a pre/code tag, add it here
        break;
    }
  }
}

forEachChild(preview.contentWindow.document.body);

It is best to use an HTML parser, such as https://www.npmjs.com/package/node-html-parser. It is possible to use regex, but it is not that robust.

I do not understand why you want to unescape the &lt; and &gt; just outside <code> and <pre> tags, but you can use this code if you want to go the regex route:

const input = "Lorem ipsum dolor <h1>sit amet</h1>, <h6>consectetur <b>adipiscing</b> </h6>elit. <br>Sed erat odio, &lt;fringilla&gt; in lorem eu. <pre>pre text with &lt;tag&gt</pre>. Back to &lt;normal&gt; text";
const tagRegex = /(<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>)/;
let inPreOrCode = false;
let result = input.split(tagRegex).map(str => {
  if (tagRegex.test(str)) {
    // is tag
    if (str.match(/^<(code|pre)\b/i)) {
      inPreOrCode = true;
    } else if (str.match(/^<\/(code|pre)\b/i)) {
      inPreOrCode = false;
    }
  } else if (!inPreOrCode) {
    str = str.replace(/&lt;/g, '<').replace(/&gt;/g, '>');
  }
  return str;
}).join('');
console.log('Input:  ' + input);
console.log('Result: ' + result);

Output:

Input:  Lorem ipsum dolor <h1>sit amet</h1>, <h6>consectetur <b>adipiscing</b> </h6>elit. <br>Sed erat odio, &lt;fringilla&gt; in lorem eu. <pre>pre text with &lt;tag&gt</pre>. Back to &lt;normal&gt; text
Result: Lorem ipsum dolor <h1>sit amet</h1>, <h6>consectetur <b>adipiscing</b> </h6>elit. <br>Sed erat odio, <fringilla> in lorem eu. <pre>pre text with &lt;tag&gt</pre>. Back to <normal> text

Explanation:

  • enclose the whole tagRegex in parentheses (a capturing group); this makes split include the tags in the resulting array
  • map through the array and set/clear the inPreOrCode flag on entry/exit of those tags
  • if the flag is not set, unescape the &lt; and &gt;
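
The first point relies on documented behavior of String.prototype.split: when the separator regex contains a capturing group, the captured text is spliced into the result array. A minimal sketch of just that step follows. The helper name splitKeepingTags is hypothetical, and the tag pattern is deliberately simplified for brevity (it does not handle a ">" inside an attribute value, which the longer regex above does).

```javascript
// Hypothetical helper: split a string on tags but keep the tags in the result.
function splitKeepingTags(html) {
  // The parentheses form a capturing group, so split() keeps each matched tag.
  const parts = html.split(/(<[^>]+>)/);
  // Adjacent tags (or a tag at the very start or end) yield empty strings; drop them.
  return parts.filter((part) => part !== "");
}

console.log(splitKeepingTags("a <b>bold</b> c"));
// [ 'a ', '<b>', 'bold', '</b>', ' c' ]
```

Whitespace around the text pieces is preserved here; trim each piece afterwards if you want the exact array shape shown in the question.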

This post can help you with capturing delimiters: https://stackoverflow.com/a/1732454/485337

For checking tag enclosure, you are in the territory of https://stackoverflow.com/a/1732454/485337, as noted in comments.
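
If you stay with a regex-based token array despite that caveat, note that a single boolean flag is cleared by the first closing </code> or </pre>, even when an outer <pre> is still open. Below is a rough sketch that counts nesting depth instead. It assumes well-formed, properly nested markup, and both the function name unescapeOutsidePreCode and the simplified tag pattern are illustrative rather than taken from the answers above.

```javascript
// Hypothetical helper: unescape &lt; and &gt; only outside <pre>/<code> elements.
function unescapeOutsidePreCode(html) {
  let depth = 0; // number of currently open <pre>/<code> elements

  return html
    .split(/(<[^>]+>)/) // the capturing group keeps the tags in the array
    .map((part) => {
      const isTag = /^<[^>]+>$/.test(part);
      if (isTag) {
        if (/^<(pre|code)\b/i.test(part)) {
          depth++; // entering a pre/code element
        } else if (/^<\/(pre|code)\b/i.test(part)) {
          depth = Math.max(0, depth - 1); // leaving one; never go below zero
        }
        return part; // tags are passed through unchanged
      }
      // Text piece: unescape only when no pre/code element is open.
      return depth === 0
        ? part.replace(/&lt;/g, "<").replace(/&gt;/g, ">")
        : part;
    })
    .join("");
}

console.log(unescapeOutsidePreCode("&lt;a&gt; <pre><code>&lt;b&gt;</code> &lt;c&gt;</pre> &lt;d&gt;"));
// <a> <pre><code>&lt;b&gt;</code> &lt;c&gt;</pre> <d>
```

This still inherits the usual limits of matching tags with regex (comments, unclosed tags, ">" inside attributes), so a real parser remains the safer choice for untrusted input.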
