
PHP get all unclosed HTML tags in string

How can I get all the unclosed tags in a given string, preferably in the order in which they should be closed?

Note: assume there are no errors in the HTML and that it was simply cut off after X characters. No, this is not a case of bad HTML, overlapping tags, etc. Also there will be no ending

Example: <p><span>Lorem</span><b>ipsum ---return---> </b></p>
-OR-
<ul><li>1</li><li>2 ---return---> </li></ul>

The idea is that concatenating the string with the function's output yields valid HTML again.

I'm not sure whether a RegExp would do the trick here. Basically, I want to get every tag between < and > that does not have a matching </ > closing tag.

Thank you.

This is not an easy task. You might want to look at Tidy:

Tidy is a binding for the Tidy HTML clean and repair utility which allows you to not only clean and otherwise manipulate HTML documents, but also traverse the document tree.

http://php.net/manual/en/book.tidy.php
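
For the narrow case described in the question (well-formed HTML that was simply cut off), a small stack-based scan may be enough without reaching for Tidy. The following is only a minimal sketch, not a full HTML parser: closingTagsFor is a hypothetical helper name, the void-element list is an assumption about which tags never need closing, and the code assumes the cut did not land in the middle of a tag and that attribute values, comments and scripts contain no stray < or > characters.

```php
<?php
// Hypothetical helper (not from the original post): returns the closing tags
// needed to complete a truncated HTML fragment, innermost tag first.
function closingTagsFor(string $html): string
{
    // Assumed list of void elements; they never take a closing tag.
    $void = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
             'link', 'meta', 'source', 'track', 'wbr'];
    $stack = [];

    // Match complete tags only; a tag cut off before its ">" is ignored.
    // Group 1: "/" for a closing tag, group 2: tag name, group 3: "/" if self-closed.
    preg_match_all(
        '~<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*?(/?)>~',
        $html,
        $matches,
        PREG_SET_ORDER
    );

    foreach ($matches as $m) {
        $name = strtolower($m[2]);
        if ($m[1] === '/') {
            // Closing tag: pop it if it matches the innermost open tag.
            if (end($stack) === $name) {
                array_pop($stack);
            }
        } elseif ($m[3] !== '/' && !in_array($name, $void, true)) {
            // Opening tag that still needs a closing tag later.
            $stack[] = $name;
        }
    }

    // Close the remaining open tags in reverse order of opening.
    $closing = '';
    while ($stack) {
        $closing .= '</' . array_pop($stack) . '>';
    }
    return $closing;
}
```

With the two examples from the question, closingTagsFor('<p><span>Lorem</span><b>ipsum') gives </b></p> and closingTagsFor('<ul><li>1</li><li>2') gives </li></ul>. If the cut can fall inside a tag (for example a string ending in <sp), that partial tag would have to be stripped before appending the result.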
