Replace non-HTML tags with correct syntax

I have a source program that delivers text with non-HTML tags and incorrect syntax. For example:

the <H>quick</> brown fox.
the <U>quick</> brown fox.
<H><U>The</> quick brown fox.
<H><U>The</> quick </> brown fox.

The outcome should be something like:

the quick brown fox.

the quick brown fox.

The quick brown fox.

The quick brown fox.

So the tags used are not valid HTML, and they are also not closed as they should be. I'm struggling to get this working in JavaScript.

I started with something like:

var s = document.getElementById('root').innerHTML;
s = s.replace("&lt;H&gt;", "<b>");  
s = s.replace("&lt;h&gt;", "<b>");    
s = s.replace("&lt;/&gt;","</b>");   
document.getElementById('root').innerHTML = s;

root is the div that contains everything. The tags will appear in a div with class "label components"; there will be multiple divs with that class (and thus multiple occurrences of the incorrect tags on a page).

How can I best tackle this?

Probably easiest to write a small parser/processor that uses a stack to keep track of the tags that still need to be closed:

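const s1 = 'the <H>quick</> brown fox.';
const s2 = 'the <U>quick</> brown fox.';
const s3 = '<H><U>The</> quick brown fox.';
const s4 = '<H><U>The</> quick </> brown fox.';

const process = (s) => {
  // Custom tag letter -> HTML tag name
  const map = {'H': 'b', 'U': 'i'};
  // HTML tags that are currently open, innermost last
  const stack = [];
  // Match <X> for a single uppercase letter X, or the generic closer </>
  return s.replace(/<([A-Z/])>/g, (_, t) => {
    if (map[t]) {
      // Known opening tag: remember it and emit the HTML equivalent
      stack.push(map[t]);
      return `<${map[t]}>`;
    } else {
      // Anything else (here: </>) closes the most recently opened tag
      return `</${stack.pop()}>`;
    }
  });
};

console.log(process(s1)); // the <b>quick</b> brown fox.
console.log(process(s2)); // the <i>quick</i> brown fox.
console.log(process(s3)); // <b><i>The</i> quick brown fox.
console.log(process(s4)); // <b><i>The</i> quick </b> brown fox.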

Your third example still produces invalid HTML, because the number of opening and closing tags doesn't match. If that's more than just a mistake in your example, you'll need a more complex solution, and you would have to specify what the desired behavior is supposed to be.
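If the unbalanced input is real, one possible behavior is to close whatever is still open at the end of the string. The following is only a sketch under several assumptions that the question does not confirm: leftover tags are closed at the end, innermost first; a stray </> with nothing open is dropped; lowercase letters such as <h> are treated like uppercase; and both the raw form (<H>) and the entity-escaped form (&lt;H&gt;) returned by innerHTML are accepted. The names convertTags and TAG_MAP are made up for this example.

```javascript
// Mapping taken from the answer above: <H> becomes <b>, <U> becomes <i>.
const TAG_MAP = { H: 'b', U: 'i' };

// Convert the custom tags in `s` and close anything left open at the end.
function convertTags(s) {
  const stack = []; // HTML tags currently open, innermost last
  const converted = s.replace(
    /(?:<|&lt;)([A-Za-z\/])(?:>|&gt;)/g,
    (match, t) => {
      if (t === '/') {
        // Assumption: a closer with nothing open is silently dropped.
        return stack.length ? `</${stack.pop()}>` : '';
      }
      const tag = TAG_MAP[t.toUpperCase()];
      if (!tag) return match; // unknown letter: leave the text untouched
      stack.push(tag);
      return `<${tag}>`;
    }
  );
  // Assumption: tags that were never closed are closed at the very end.
  let closers = '';
  while (stack.length) closers += `</${stack.pop()}>`;
  return converted + closers;
}

console.log(convertTags('<H><U>The</> quick brown fox.'));
// <b><i>The</i> quick brown fox.</b>

// In a browser, apply it to every div that has both classes.
// The guard keeps this block runnable outside a browser as well.
if (typeof document !== 'undefined') {
  document.querySelectorAll('.label.components').forEach((el) => {
    el.innerHTML = convertTags(el.innerHTML);
  });
}
```

Processing each div separately keeps an unclosed tag in one label from leaking into the next one. Note that a genuine single-letter <u> element already present in the markup would also be rewritten by this pattern, so the regex may need tightening for real pages.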

In my opinion, the best route in this case would be to open the HTML document in an editor like VS Code and use the find-and-replace tool. You should also use the HTMLHint extension for VS Code to highlight all the problems in the HTML document.
