
PHP Simple HTML DOM Parser: How to remove <font> tags from script output?

I'm using PHP Simple HTML DOM Parser to extract a list of URLs from a page as follows:

<?php
include('simple_html_dom.php');
$url = 'http://www.domain.com/';
$html = file_get_html($url);
foreach($html->find('table[width=370]') as $table)
    {
    foreach($table->find('a') as $item)
        echo $item->outertext . '<br><hr>';
    }
$html->clear();
?>

It works fine in that it extracts the required information. However, some of the a tags (on domain.com) are formatted like this:

<a href="http://www.domain.com"><font size="2">Anchor text</font></a>

Whereas, in others, the font size is defined in the p tag that contains each a tag, meaning the a tag is displayed as:

<a href="http://www.domain.com">Anchor text</a>

Is there any way to strip out the font tag from those a tags that have it? It's probably very simple, but I've been 'running around in rings' for ages trying to do it :(

Thanks for any ideas or suggestions you might have.

Tom.

strip_tags() maybe?

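If you only want to keep the a tags, pass them as the allowed tags. Note that the tag name has to be written with its angle brackets ('<a>', not 'a'), otherwise the a tags get stripped as well:

echo strip_tags($item->outertext, '<a>');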
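To check the behaviour without fetching a page, here is a minimal sketch that runs strip_tags() on the two anchor formats from the question. The variable names ($withFont, $withoutFont, $cleaned, $unchanged) are made up for illustration, and it assumes the anchors only ever contain font tags and plain text.

```php
<?php
// Sample markup copied from the question: one anchor wrapped in <font>, one without.
$withFont    = '<a href="http://www.domain.com"><font size="2">Anchor text</font></a>';
$withoutFont = '<a href="http://www.domain.com">Anchor text</a>';

// The second argument lists the tags to keep, written with angle brackets.
// Everything else (here: <font> and </font>) is removed; the text inside stays.
$cleaned   = strip_tags($withFont, '<a>');
$unchanged = strip_tags($withoutFont, '<a>');

echo $cleaned, "\n";    // <a href="http://www.domain.com">Anchor text</a>
echo $unchanged, "\n";  // <a href="http://www.domain.com">Anchor text</a>
```

Both anchors come out identical, so in the original loop it should be enough to replace $item->outertext with strip_tags($item->outertext, '<a>'). Attributes on the kept a tag (such as href) are left untouched.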

