Converting markdown to html with javascript in rich text editor

Question

I am developing a rich text editor for my website. If the user writes something that has HTML syntax, I would like the editor to convert it to HTML, just like the text editor in Stack Overflow.

I would like it to:

split the text on each tag, and the array elements should include the tag that was written
transform &lt; and &gt; into their corresponding signs (< and >), unless they are inside PRE and CODE tags

So far, I have tried a Regexp I found here for splitting the HTML, but when I test the code below, I get:

['Lorem ipsum dolor', 'sit amet', 'consectetur', 'adipiscing', 'elit.', 'Sed erat odio, fringilla in lorem eu.']

which is definitely not what I want. I would want something like:

['Lorem ipsum dolor', '<h1>', 'sit amet', '</h1>', '<h6>', 'consectetur', '<b>', 'adipiscing', '</b>', '</h6>', 'elit.', '<br>', 'Sed erat odio, fringilla in lorem eu.']

Then I would just:

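function splitHTML(str) {
  // The tags act as separators here, so they are dropped from the result
  return str.split(/<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>/g)
}

function isHTML(str) {
  // RegExp has no match() method; test() returns a boolean
  return /<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>/.test(str)
}

const arr = splitHTML("Lorem ipsum dolor <h1>sit amet</h1>, <h6>consectetur <b>adipiscing</b> </h6>elit. <br>Sed erat odio, fringilla in lorem eu.")

// Write back by index; reassigning a for...of loop variable would not change the array
for (let i = 0; i < arr.length; i++) {
  if (isHTML(arr[i])) {
    arr[i] = arr[i].replaceAll('&lt;', '<');
    arr[i] = arr[i].replaceAll('&gt;', '>');
  }
}

arr.join('')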

My question is:

How can I split a text and keep the separators in the result?

And I also would like to know how to check if the code is between pre and code tags.

Answer 1

You do not have to iterate through an object to display the HTML. You can do something as simple as:

// Create a new iframe HTML element
const preview = document.createElement("iframe");

// Set a unique id so it is easier to reference in code later on (you can also use the id in CSS)
preview.id = "preview";

// Set the iframe's content according to your HTML string
preview.srcdoc = yourHtmlString;

// Add the iframe to the page's body (or whatever element you want)
document.body.append(preview);

If, for whatever reason, you have to iterate through the HTML elements, you can add the following code:

function forEachChild(element) {
  for (let i = 0; i < element.children.length; i++) {
    // Recurse first, so nested elements are visited before their parent
    forEachChild(element.children[i]);

    // Whatever you want to do for each element, write it here

    // Please note that replacing "&lt;" and "&gt;" is not necessary using the above code
    // snippet. However, if there is some other tag-specific code, here is how to add it:
    switch (element.children[i].tagName.toLowerCase()) {
      case "pre":
      case "code":
        // If there is something specific you want to do with a pre/code tag, add it here
        break;
    }
  }
}

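// The srcdoc content loads asynchronously, so wait for the iframe's load event
preview.addEventListener("load", () => forEachChild(preview.contentDocument.body));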

Answer 2

It is best to use an HTML parser, such as https://www.npmjs.com/package/node-html-parser . It is possible to use regex, but it is less robust.

I do not understand why you want to unescape &lt; and &gt; only outside <code> and <pre> tags, but you can use this code if you want to go the regex route:

const input = "Lorem ipsum dolor <h1>sit amet</h1>, <h6>consectetur <b>adipiscing</b> </h6>elit. <br>Sed erat odio, &lt;fringilla&gt; in lorem eu. <pre>pre text with &lt;tag&gt</pre>. Back to &lt;normal&gt; text";
const tagRegex = /(<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>)/;
let inPreOrCode = false;
let result = input.split(tagRegex).map(str => {
  if(tagRegex.test(str)) {
    // is tag
    if(str.match(/^<(code|pre)\b/i)) {
      inPreOrCode = true;
    } else if(str.match(/^<\/(code|pre)\b/i)) {
      inPreOrCode = false;
    }
  } else if(!inPreOrCode) {
    str = str.replace(/&lt;/g, '<').replace(/&gt;/g, '>');
  }
  return str;
}).join('');
console.log('Input:  ' + input);
console.log('Result: ' + result);

Output:

Input:  Lorem ipsum dolor <h1>sit amet</h1>, <h6>consectetur <b>adipiscing</b> </h6>elit. <br>Sed erat odio, &lt;fringilla&gt; in lorem eu. <pre>pre text with &lt;tag&gt</pre>. Back to &lt;normal&gt; text
Result: Lorem ipsum dolor <h1>sit amet</h1>, <h6>consectetur <b>adipiscing</b> </h6>elit. <br>Sed erat odio, <fringilla> in lorem eu. <pre>pre text with &lt;tag&gt</pre>. Back to <normal> text

Explanation:

enclose the whole tagRegex in parentheses (a capturing group); this makes split include the tags in the resulting array
map through the array and set/clear the inPreOrCode flag on entry/exit of those tags
if the flag is not set, unescape &lt; and &gt;
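
To see the first point in isolation, here is a minimal sketch that runs the same tag pattern with and without the outer parentheses. The names withoutGroup, withGroup, sample, dropped and kept are made up for this illustration; only the pattern itself comes from the code above.

```javascript
// The tag pattern from the answer, once without and once with a capturing group.
const withoutGroup = /<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>/;
const withGroup = /(<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>)/;

// Hypothetical sample input, kept short on purpose.
const sample = "one <b>two</b> three";

// No capturing group: the tags are treated as separators and dropped.
const dropped = sample.split(withoutGroup);
console.log(dropped); // [ 'one ', 'two', ' three' ]

// Capturing group: each matched tag is spliced into the result array.
const kept = sample.split(withGroup);
console.log(kept); // [ 'one ', '<b>', 'two', '</b>', ' three' ]
```

With a single capturing group the result alternates between text and tags, so the tags always sit at the odd indices.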

Answer 3

This post can help you with capturing delimiters: https://stackoverflow.com/a/1732454/485337

For checking tag enclosure, you are in the territory of https://stackoverflow.com/a/1732454/485337 , as noted in comments.
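
If a real parser is not an option, one possible way to track enclosure is a depth counter instead of a boolean flag, so that nested <pre><code> pairs do not reset the state too early. This is only a sketch under the assumption that the input is well nested; the function name unescapeOutsideCode and the sample string are hypothetical, and the tag pattern is the one from Answer 2. It is still regex-based, so it inherits the fragility described in the linked post.

```javascript
// Hypothetical helper: unescapes &lt; and &gt; only outside <pre>/<code> elements.
function unescapeOutsideCode(html) {
  const tagRegex = /(<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>)/;
  let depth = 0; // number of currently open <pre>/<code> elements

  return html
    .split(tagRegex)
    .map((part, index) => {
      // With one capturing group, odd indices hold the captured tags.
      if (index % 2 === 1) {
        if (/^<(code|pre)\b/i.test(part)) {
          depth += 1;
        } else if (/^<\/(code|pre)\b/i.test(part)) {
          depth = Math.max(0, depth - 1); // ignore stray closing tags
        }
        return part;
      }
      // Text segment: unescape only when no <pre>/<code> is open.
      return depth === 0
        ? part.replace(/&lt;/g, "<").replace(/&gt;/g, ">")
        : part;
    })
    .join("");
}

// Assumed sample with a <code> nested inside a <pre>.
const nested = "a &lt;x&gt; <pre><code>&lt;y&gt;</code> &lt;z&gt;</pre> &lt;w&gt;";
console.log(unescapeOutsideCode(nested));
// a <x> <pre><code>&lt;y&gt;</code> &lt;z&gt;</pre> <w>
```

With a plain boolean flag, the text after </code> but still inside <pre> would be unescaped; the counter keeps it escaped until the outer </pre> closes.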

3 answers

solution1
0 2020-12-19 16:01:07

solution2
0 2020-12-28 02:54:19

solution3
-1 2020-12-18 22:09:33
