
PHP: Remove specific a tag or img tag if href or src does not start with http, https, or www

I want to remove specific <a> and <img> tags from $string_1 if their href or src attribute does not start with www, http or https.

For example, $string_1 is converted to $string_2 by removing:

<img src="/wp-content/uploads/2014/06/photography-business-2.jpg" alt="photography business growth 1 650x430 6 Simple Ways To Help Grow Your Photography Business" width="650" height="430" class="alignnone size-large wp-image-609513" title="6 Simple Ways To Help Grow Your Photography Business"/>

and

<a href="/photography-business-growth/" rel="nofollow">Read more about Photography Business Growth &gt;</a>

because their src and href attributes do not start with http, https or www.

$string_1 = '
<div class="mainpost"><p><img src="/wp-content/uploads/2014/06/photography-business-2.jpg" alt="photography business growth 1 650x430 6 Simple Ways To Help Grow Your Photography Business" width="650" height="430" class="alignnone size-large wp-image-609513" title="6 Simple Ways To Help Grow Your Photography Business"/></p>
<div class="mainpost"><p><img src="http://www.domain.com/wp-content/uploads/2014/06/photography-business-2.jpg" alt="photography business growth 1 650x430 6 Simple Ways To Help Grow Your Photography Business" width="650" height="430" class="alignnone size-large wp-image-609513" title="6 Simple Ways To Help Grow Your Photography Business"/></p>
<p><a href="http://domain.com/photography-business-growth/" rel="nofollow">Read more about Photography Business Growth &gt;</a></p>
<p>Photography Business Growth | With a world wide recession, photographers and small business owners are forced, more than ever, to think creatively, to think differently and outside of the box. With very little or no money to invest in your business, can you move forward? How can you build your brand and make sure to get happier, paying clients through your door?<br/><span id="more-609494"/></p>
<p>If you take good shots it doesn’t mean you’ll gain success and popularity among customers. For those of you who have survived start=up and built successful brands, you may be wondering which step to take next to grow your business beyond its current status. There are numerous possibilities, some of which we’ll outline here. You need to know how to sell yourself well! Everything is quite simple and you can do it yourself.</p>
<p><a href="/photography-business-growth/" rel="nofollow">Read more about Photography Business Growth &gt;</a></p>
';

$string_2= '
<div class="mainpost"><p></p>
<div class="mainpost"><p><img src="http://www.domain.com/wp-content/uploads/2014/06/photography-business-2.jpg" alt="photography business growth 1 650x430 6 Simple Ways To Help Grow Your Photography Business" width="650" height="430" class="alignnone size-large wp-image-609513" title="6 Simple Ways To Help Grow Your Photography Business"/></p>
<p><a href="http://domain.com/photography-business-growth/" rel="nofollow">Read more about Photography Business Growth &gt;</a></p>
<p>Photography Business Growth | With a world wide recession, photographers and small business owners are forced, more than ever, to think creatively, to think differently and outside of the box. With very little or no money to invest in your business, can you move forward? How can you build your brand and make sure to get happier, paying clients through your door?<br/><span id="more-609494"/></p>
<p>If you take good shots it doesn’t mean you’ll gain success and popularity among customers. For those of you who have survived start=up and built successful brands, you may be wondering which step to take next to grow your business beyond its current status. There are numerous possibilities, some of which we’ll outline here. You need to know how to sell yourself well! Everything is quite simple and you can do it yourself.</p>
';

Could you please help me solve this problem? Thanks.

Here is a first approach in PHP. It works for your example data. Note that the trailing "<p></p>" was missing from your $string_2.

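$string_3 = $string_1;
// Matches the first characters of any URL that does not begin with "www" or "http"
$pattern = "([^wh]|w[^w]|ww[^w]|h[^t]|ht[^t]|htt[^p])";
// Remove <img> tags whose src is the first attribute and fails the prefix test
$string_3 = preg_replace("/<img src=\"".$pattern."[^>]*>/","",$string_3);
// Remove <a> elements with plain-text content whose href is the first attribute and fails the prefix test
$string_3 = preg_replace("/<a href=\"".$pattern."[^>]*>[^<]*<\/a>/","",$string_3);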
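The same idea can also be written with a negative lookahead, which may be easier to read than spelling out every non-matching prefix. This is only a sketch under a few assumptions: the function name strip_relative_tags is made up for illustration, attribute values are double-quoted, and "starts with http, https or www" is interpreted more strictly as "http://", "https://" or "www.". Unlike the pattern above, it does not require src or href to be the first attribute.

```php
<?php
// Hypothetical helper (the name is not from the original answer).
function strip_relative_tags(string $html): string
{
    // Negative lookahead: the match fails if the URL begins with
    // "http://", "https://" or "www." (a stricter reading of the question).
    $notAbsolute = '(?!https?:\/\/|www\.)';

    // Remove <img> tags whose src does not look absolute.
    // \s before "src" avoids matching attributes such as data-src.
    $html = preg_replace(
        '/<img\b[^>]*\ssrc="' . $notAbsolute . '[^"]*"[^>]*>/i',
        '',
        $html
    );

    // Remove whole <a>...</a> elements whose href does not look absolute.
    // .*? with the s flag lets the link text span lines and contain inline tags.
    $html = preg_replace(
        '/<a\b[^>]*\shref="' . $notAbsolute . '[^"]*"[^>]*>.*?<\/a>/is',
        '',
        $html
    );

    return $html;
}
```

Calling strip_relative_tags($string_1) should then drop the two relative tags and leave the empty <p></p> wrappers in place. Like any regex over HTML, this stays fragile for unusual markup (single-quoted attributes, nested links), so treat it as a quick fix for input you control.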

I would use a DOM parser for this. Once you have a DOM document, you can use XPath to select the desired elements.

# Parse the HTML snippet into a DOM document
$doc = new DOMDocument();
$doc->loadHTML($string_1);

# Create an XPath selector
$selector = new DOMXPath($doc);

# Define the XPath query
# The syntax highlighter messed this up. Take it as it is!
$query = <<<EOF
  //a[not(starts-with(@href, "http"))
  and not(starts-with(@href, "www"))]
| //img[not(starts-with(@src, "http"))
  and not(starts-with(@src, "www"))]
EOF;

# Issue the XPath query and remove every resulting node
foreach($selector->query($query) as $node) {
    $node->parentNode->removeChild($node);
}

# Write back the modified `<div>` element into a string
echo $doc->saveHTML(
    $selector->query('//div[@class="mainpost"]')->item(0)
);
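If you need this as a reusable step for arbitrary fragments, the DOM approach can be wrapped in a function. The following is a sketch, not a definitive implementation: the function name remove_relative_nodes and the wrapper id fragment-root are assumptions made for illustration, and the XML-declaration prefix is a common workaround to make loadHTML treat the input as UTF-8.

```php
<?php
// Hypothetical helper; the name and the wrapper id are illustrative only.
function remove_relative_nodes(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings silently; real-world snippets are rarely valid HTML.
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> gives one known root element to serialize later.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="fragment-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = '//a[not(starts-with(@href, "http")) and not(starts-with(@href, "www"))]'
           . ' | //img[not(starts-with(@src, "http")) and not(starts-with(@src, "www"))]';

    // The XPath result is a static list, so removing nodes while iterating is safe.
    foreach ($xpath->query($query) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Serialize only the wrapper's children, so no <html> or <body> is added.
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Two things to keep in mind: an <a> without any href attribute also matches the query and is removed, and the parser normalizes the markup (for example, unclosed elements get closed and <img .../> is written back as <img ...>), so the output will not be byte-identical to the input.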

One solution would be to do this on the front end with JavaScript. If that's not an option, you can look into a PHP library that parses and traverses the DOM, such as http://simplehtmldom.sourceforge.net
