
Parsing Malformed HTML with PHP Dom

I've got a client who wants their videos (provided by a third party) displayed on their web site. The web site uses swfobject to display the video, so I thought that it would be easiest to grab that and slightly modify it so that it works on the client's web site.

Using PHP's DOMDocument seems the way to go, but unfortunately the HTML that is provided is malformed and DOMDocument chokes on it. Is it possible to get it to ignore the errors in the HTML, or is there an alternative way to do this?

This is what PHP Tidy is for. For example:

    <?php ob_start(); ?>
    <html>a html document</html>
    <?php
    $html = ob_get_clean();

    // Specify configuration
    $config = array(
        'indent'       => true,
        'output-xhtml' => true,
        'wrap'         => 200
    );

    // Tidy
    $tidy = new tidy;
    $tidy->parseString($html, $config, 'utf8');
    $tidy->cleanRepair();

    // Output
    echo $tidy;
    ?>

See the HTML Tidy Configuration Options for the full list of settings.
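Since the question is about getting the markup into DOMDocument, here is a minimal sketch of how the Tidy output might be handed on to it. It assumes the tidy and dom extensions are both installed; the function name `tidyToDom` is made up for illustration and is not part of either extension.

```php
<?php
// Hypothetical helper: repair markup with Tidy, then parse it with DOMDocument.
function tidyToDom(string $html): DOMDocument
{
    $tidy = new tidy();
    // 'wrap' => 0 turns off line wrapping so long attribute values stay on one line.
    $tidy->parseString($html, ['wrap' => 0], 'utf8');
    $tidy->cleanRepair();

    $doc = new DOMDocument();
    // tidy_get_output() returns the repaired markup as a string.
    $doc->loadHTML(tidy_get_output($tidy));

    return $doc;
}
```

Calling `tidyToDom($html)->getElementsByTagName('object')` should then give you the repaired `<object>` elements to modify, though how well this works depends on how badly the third party's HTML is broken.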

If you like jQuery, you can use "Simple HTML DOM Parser", a PHP library whose selector syntax is modelled on jQuery's. It works great.
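One more option, not taken from the answers above: DOMDocument's HTML parser recovers from a lot of malformed markup on its own, and the warnings it raises can be collected silently with `libxml_use_internal_errors()`. The sketch below assumes that the warnings are the real problem; `loadLenient` and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper: parse HTML with DOMDocument without emitting warnings.
function loadLenient(string $html): DOMDocument
{
    // Collect libxml parse errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected errors and restore the earlier setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}

// Hypothetical stand-in for the third party's broken markup (unclosed tags).
$broken = '<div><p>Video<object data="movie.swf" width="400">'
        . '<param name="quality" value="high"></div>';

$doc = loadLenient($broken);
foreach ($doc->getElementsByTagName('object') as $object) {
    // Read an attribute from each recovered <object> element.
    echo $object->getAttribute('data'), "\n";
}
```

If you want to see what the parser complained about, call `libxml_get_errors()` before `libxml_clear_errors()`.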
