I need to remove all HTML tags except the <sub> tag:
var str = "something1
<sub>
something2
<div class='myclass'>something3</div>
</sub>
<div class='myclass'>something4</div>
something5
<div class='myclass'>something6</div>
<div class='myclass'>something7</div>
`<div>something8</div>`
something9";
Expected output:
/*
something1
<sub>
something2
something3
</sub>
something4
something5
<div class='myclass'>something6</div>
`<div>something8</div>`
something9
*/
Here is what I've tried so far:
/\n\s{0,3}<.*[^>]+|<sub>.*?<\/sub>|`.*?`/gm
This is possible with a regex substitution. Use this regex with the g and m modifiers:
(\n\n {4}.*|`[^`]+`|<\/?sub\b[^>]*>)|<[^>]+>
and use $1 as the substitution.
There are several parts to this. The capturing group matches everything you want to keep:
\n\n {4}.*
An empty line followed by a line that starts with 4 spaces.
`[^`]+`
Anything in backticks.
<\/?sub\b[^>]*>
A sub element, opening or closing. The [^>]* allows attributes but does not require them, so a bare <sub> or </sub> still matches.
Any remaining HTML tag matches
<[^>]+>
which lies outside the capturing group. For those matches $1 is empty, so the tag is discarded.
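As a rough sketch, the substitution could be applied in JavaScript like this. The sample string is adapted from the question. The blank line and four-space indent before the something6 line are assumptions (the question's whitespace did not survive), and the names keepOrStrip and stripTags are made up for illustration. The sub alternative here uses [^>]* so that a bare <sub> with no attributes is kept.

```javascript
// Sample input adapted from the question. The blank line and the
// four-space indent before the something6 line are assumptions.
const str = [
  "something1",
  "<sub>",
  "something2",
  "<div class='myclass'>something3</div>",
  "</sub>",
  "<div class='myclass'>something4</div>",
  "something5",
  "",
  "    <div class='myclass'>something6</div>",
  "",
  "<div class='myclass'>something7</div>",
  "`<div>something8</div>`",
  "something9",
].join("\n");

// Group 1 captures what should survive: an indented line after an empty
// line, a backtick span, or a <sub> / </sub> tag. Any other tag matches
// the final alternative, outside the group.
const keepOrStrip = /(\n\n {4}.*|`[^`]+`|<\/?sub\b[^>]*>)|<[^>]+>/gm;

// Hypothetical helper: "$1" puts back whatever group 1 captured. When
// group 1 did not take part in the match, "$1" is the empty string, so
// the tag is dropped.
const stripTags = (input) => input.replace(keepOrStrip, "$1");

const result = stripTags(str);
console.log(result);
```

The order of the alternatives matters: the "keep" patterns come first, so at any position where both could match, the capturing group wins and the text is preserved.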