How can I get all the unclosed tags in a given string, preferably in the order they should be closed?
Note: assume there are no errors in the HTML and that it was simply cut off after X characters. No, it's not a case of bad HTML, overlapping tags, etc. Also there will be no ending
Example: <p><span>Lorem</span><b>ipsum
---return---> </b></p>
-OR-
<ul><li>1</li><li>2
---return---> </li></ul>
The goal is that concatenating the string with the function's output produces valid HTML again.
I'm not sure whether a RegExp would do the trick here. Basically, I want to get every tag between < and > that does not have a matching </ > closing tag.
Thank you.
This is not an easy task. You might want to look at Tidy:
Tidy is a binding for the Tidy HTML clean and repair utility which allows you to not only clean and otherwise manipulate HTML documents, but also traverse the document tree.
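For the narrow case described in the question (well-formed HTML that was simply cut off), a stack of open tags may be enough: push every opening tag, pop on its matching closing tag, and whatever is left on the stack, read in reverse, is what still needs closing. Below is a minimal sketch of that idea in Python using the standard library's html.parser module. It is not Tidy and not a general repair tool. The names OpenTagTracker and closing_tags are made up for this example, and the code assumes tags are properly nested, as the question states.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they must not be tracked.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class OpenTagTracker(HTMLParser):
    """Hypothetical helper: keeps a stack of tags that are still open."""

    def __init__(self):
        super().__init__()
        self.stack = []

    def handle_starttag(self, tag, attrs):
        # Remember every opening tag that will need a closing tag later.
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/> opens nothing.
        pass

    def handle_endtag(self, tag):
        # Assumption from the question: tags are properly nested,
        # so a closing tag always matches the top of the stack.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()


def closing_tags(fragment):
    """Return the closing tags for a truncated fragment, innermost first."""
    tracker = OpenTagTracker()
    tracker.feed(fragment)
    return "".join(f"</{tag}>" for tag in reversed(tracker.stack))


print(closing_tags("<p><span>Lorem</span><b>ipsum"))
print(closing_tags("<ul><li>1</li><li>2"))
```

If the cut falls in the middle of a tag (for example the string ends with "<spa"), this sketch does not count that partial tag as open, so you would probably want to trim it from the string before appending the result.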