
Remove <p>&nbsp;</p> with DOM or regex

How can I remove a p tag of the form <p>&nbsp;</p> using DOM or a regex?

I also want to remove several of them in a row, like this:

<p>&nbsp;</p>
<p>&nbsp;</p>
<p>&nbsp;</p>

If the string you want to remove is always exactly '<p>&nbsp;</p>', the simplest and fastest solution is probably str_replace():

$new_string = str_replace('<p>&nbsp;</p>', '', $old_string);

I don't think it's necessary to use DOM for such a simple case -- and a regex is not necessary here.


Of course, if you need to replace something more complex, that is not always exactly the same string... well, it'll be time for DOM manipulations ;-)
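Between the exact-match str_replace() call and full DOM manipulation there is a middle ground: a slightly more tolerant pattern. The following is only a minimal sketch, assuming the variations are limited to attributes on the tag, letter case, and extra whitespace or repeated &nbsp; inside the paragraph. The function name remove_blank_paragraphs is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: removes <p> elements whose only content is
// whitespace and/or non-breaking spaces (as the &nbsp; entity or as a
// raw U+00A0 character), together with any whitespace that follows.
function remove_blank_paragraphs(string $html): string
{
    // \b keeps the pattern from matching tags such as <pre> or <param>.
    $pattern = '~<p\b[^>]*>(?:\s|&nbsp;|\x{00A0})*</p>\s*~iu';

    // preg_replace() returns null on failure (e.g. invalid UTF-8 with
    // the /u modifier); fall back to the untouched input in that case.
    return preg_replace($pattern, '', $html) ?? $html;
}

$old_string = "<p>&nbsp;</p>\n<p class=\"x\"> &nbsp; </p>\n<p>Not empty</p>";
$new_string = remove_blank_paragraphs($old_string);
echo $new_string;
```

Anything beyond these variations (nested markup, comments, malformed tags) is still better handled with DOM.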

preg_replace("|<p>&nbsp;</p>|", "", "<p>&nbsp;</p>
<p>&nbsp;</p>
<p>&nbsp;</p>");

If you would like to do this with XPath (although your example only calls for str_replace), you can match the &nbsp; entity as a string, written as its UTF-8 byte sequence:

$html = '<body><p>&nbsp;</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>Not empty :)</p>
</body>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$col = $xpath->query("//p[text()=\"\xC2\xA0\"]"); # &nbsp; as UTF-8 bytes
foreach ($col as $e) {
    $e->parentNode->removeChild($e);
}
echo $dom->saveXML($dom->getElementsByTagName('body')->item(0));

Hope this is helpful if you need to query &nbsp; with xpath.
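The query above only matches paragraphs whose text is exactly one non-breaking space. As a hedged variation that is not part of the original answer, the XPath functions translate() and normalize-space() can be combined to also catch paragraphs that hold several &nbsp; entities or ordinary whitespace around them. The not(*) test is an added assumption so that paragraphs containing only an element, such as an image, are kept. Note that truly empty <p></p> elements match this query as well.

```php
<?php
$html = '<body><p>&nbsp;</p>
<p> &nbsp;&nbsp; </p>
<p><img src="a.png"></p>
<p>Not empty :)</p>
</body>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// U+00A0 (the character behind &nbsp;) as UTF-8 bytes.
$nbsp = "\xC2\xA0";

// Turn every non-breaking space into a plain space, collapse whitespace,
// and keep only <p> elements that end up empty and have no child elements.
$query = "//p[not(*) and normalize-space(translate(., '$nbsp', ' ')) = '']";

foreach ($xpath->query($query) as $p) {
    $p->parentNode->removeChild($p);
}

$result = $dom->saveHTML($dom->getElementsByTagName('body')->item(0));
echo $result;
```

The whitespace text nodes between the removed paragraphs are left in place, so the output may contain blank lines.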

See also: Using XPATH to search text containing &nbsp;
