
preg_replace won't remove each img tag with a specific src address

I am working on PHP code that searches for images that come from a specific address, and I want to remove all of those img tags.

I have img tags that look like this:

<img src="http://example.com/someimage1.jpeg">
<img src="http://example.com/someimage2.jpeg">
<img src="http://example.com/someimage3.jpeg">
<img src="http://example.com/someimage4.jpeg">
<img style="OVERFLOW: hidden; WIDTH: 0px; MAX-HEIGHT: 0px" alt="" src="http://test.mydomain.com/project433q325/track/Images/signature.gif?id=446&amp;etc=1586624376">

When I try this:

foreach ($src as $image) {
    $image = preg_replace("\<img src\=\"(.+)\"(.+)\/\>/i", '', $src);
}

It will not remove the img tag, so I have also tried this:

foreach ($src as $image) {
    $image = preg_replace("/<img[^>]+\>/i", "", $src); 
}

I still have the same issue as it will not remove the img tag.

Here is the full code:

if (strpos($inbox_message, 'http://test.mydomain.com/project433q325/track/Images/signature.gif?') !== false) {
    $doc = new DOMDocument();
    $doc->loadHTML($inbox_message);
    $xpath = new DOMXpath($doc);
    $src = $xpath->evaluate("string(//img/@src)");

    if ($src) {
        foreach ($src as $image) {
            //image->nodeValue = preg_replace('<img.*?src='.$src.'.*?/>!i', '', $src);
            //$src = preg_replace("/<img[^>]+\>/i", "", $src);
            $image = preg_replace("\<img src\=\"(.+)\"(.+)\/\>/i", '', $src);
        //}
    }
    $inbox_message = $doc->saveHTML();
} 

What I am trying to do is find only the img tags whose src address contains 'http://test.mydomain.com/project433q325/track/Images/signature.gif?' and remove them.

Can you please show me an example of how to find each img tag that has a specific src address, so that I can remove those img tags using preg_replace?

Thank you.

EDIT: Here is the $inbox_message variable:

$inbox_message = '<img src="http://example.com/someimage1.jpeg"><img src="http://example.com/someimage2.jpeg"><img src="http://example.com/someimage3.jpeg"><img src="http://example.com/someimage4.jpeg"><img style="OVERFLOW: hidden; WIDTH: 0px; MAX-HEIGHT: 0px" alt="" src="http://test.mydomain.com/project433q325/track/Images/signature.gif?id=446&amp;etc=1586624376">';

You should not use a regex for this. You can keep your strpos check, but move it inside the DOM parsing and compare each img element's src. You can then use removeChild() to remove the matching images. (This answer is adapted from "How to delete element with DOMDocument?")

<?php
$inbox_message = '<p> Keep This</p><img src="http://example.com/someimage1.jpeg"><img src="http://example.com/someimage2.jpeg"><img src="http://example.com/someimage3.jpeg"><img src="http://example.com/someimage4.jpeg"><h1>Fake element</h1><img style="OVERFLOW: hidden; WIDTH: 0px; MAX-HEIGHT: 0px" alt="" src="http://test.mydomain.com/project433q325/track/Images/signature.gif?id=446&amp;etc=1586624376">';
$doc = new DOMDocument();
$doc->loadHTML($inbox_message);
$imgs = $doc->getElementsByTagName('img');
// Walk the list backwards: it is live, so removing a node shifts later indexes.
for ($i = $imgs->length; --$i >= 0;) {
    $node = $imgs->item($i);
    // Remove only the images whose src contains the tracking URL.
    if (strpos($node->getAttribute('src'), 'http://test.mydomain.com/project433q325/track/Images/signature.gif?') !== false) {
        $node->parentNode->removeChild($node);
    }
}
echo $doc->saveHTML();

https://3v4l.org/qinLR
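
As a variation on the same idea, an XPath query can select the matching img elements directly. The evaluate("string(//img/@src)") call in the question returns a single string (the src of the first img), which is why it cannot be looped over. The following is only a minimal sketch; the shortened $html sample and the variable names are illustrative assumptions, not part of the original code.

```php
<?php
// Shortened, illustrative input (assumption): one image to keep, one to remove.
$html = '<p>Keep This</p><img src="http://example.com/someimage1.jpeg">'
      . '<img alt="" src="http://test.mydomain.com/project433q325/track/Images/signature.gif?id=446">';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// query() returns a DOMNodeList of elements, not a string.
$nodes = $xpath->query(
    '//img[contains(@src, "http://test.mydomain.com/project433q325/track/Images/signature.gif?")]'
);

// Copy the matches into an array first, as a precaution against
// modifying the document while iterating over the result list.
foreach (iterator_to_array($nodes) as $node) {
    $node->parentNode->removeChild($node);
}

$result = $doc->saveHTML();
```

Note that contains() in XPath 1.0 is case-sensitive, so this variant matches only the exact casing of the URL.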

You could also use strtolower if $node->getAttribute('src') might contain varying case. In that case, the needle passed to strpos should be lowercased as well.
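
One way to sketch the case-insensitive comparison is with stripos, which ignores case on both the haystack and the needle and so avoids lowercasing each side by hand. The function name removeImagesBySrc is hypothetical, introduced here only for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): removes every img
// whose src contains $needle, compared case-insensitively.
function removeImagesBySrc(string $html, string $needle): string
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $imgs = $doc->getElementsByTagName('img');

    // Iterate backwards because the node list is live.
    for ($i = $imgs->length - 1; $i >= 0; $i--) {
        $node = $imgs->item($i);
        // stripos() is the case-insensitive counterpart of strpos().
        if (stripos($node->getAttribute('src'), $needle) !== false) {
            $node->parentNode->removeChild($node);
        }
    }

    return $doc->saveHTML();
}

$out = removeImagesBySrc(
    '<p>Keep</p><img src="http://example.com/a.jpeg">'
    . '<img src="HTTP://TEST.MYDOMAIN.COM/project433q325/track/Images/signature.gif?id=1">',
    'http://test.mydomain.com/project433q325/track/images/signature.gif?'
);
```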

For regex issues...

preg_replace("\<img src\=\"(.+)\"(.+)\/\>/i", '', $src);

The regex starts with a backslash, which is not a valid delimiter. A delimiter can be any non-alphanumeric, non-backslash, non-whitespace character, and the starting delimiter must match the ending delimiter. Additionally, your $src only contained the value of the attribute, so <img src... would never match.

If you were to get that functioning, the .+ would need to be replaced with the URI you wanted to check against.

BUT regex really is the wrong approach here. Use a parser, as you were, for these types of jobs. Regex shouldn't be used for structured data; if the data is structured, there are likely already functions written for it.
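
To illustrate the delimiter rule, here is a small sketch comparing two valid delimiters with the backslash-led pattern from the question. The $html sample and variable names are assumptions made for the example.

```php
<?php
// Illustrative input (assumption).
$html = '<img src="http://example.com/a.jpeg"><p>text</p>';

// Slash delimiters: any literal "/" inside the pattern would need escaping.
$withSlashes = preg_replace('/<img[^>]*>/i', '', $html);

// Tilde delimiters: handy when the pattern itself contains slashes.
$withTildes = preg_replace('~<img[^>]*>~i', '', $html);

// The pattern from the question begins with a backslash, so PCRE rejects it:
// preg_replace() raises a warning (suppressed here with @) and returns null.
$broken = @preg_replace("\<img src\=\"(.+)\"(.+)\/\>/i", '', $html);
```

A null return value from preg_replace signals an error rather than "no match"; when nothing matches, the subject string is returned unchanged.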
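
If you did go down the regex route, substituting the URI could look roughly like the sketch below. It assumes the src attribute is always double-quoted and uses preg_quote so that characters such as "." and "?" in the URL are matched literally; the variable names and the shortened input are illustrative.

```php
<?php
// URL prefix to look for, taken from the question.
$url = 'http://test.mydomain.com/project433q325/track/Images/signature.gif?';

// preg_quote() escapes regex metacharacters in the URL; the second argument
// also escapes the chosen delimiter ("~") should it appear in the URL.
$pattern = '~<img\b[^>]*\bsrc="' . preg_quote($url, '~') . '[^"]*"[^>]*>~i';

// Shortened, illustrative input (assumption).
$input = '<img src="http://example.com/someimage1.jpeg">'
       . '<img style="WIDTH: 0px" alt="" src="http://test.mydomain.com/project433q325/track/Images/signature.gif?id=446&amp;etc=1586624376">';

$output = preg_replace($pattern, '', $input);
```

This still breaks on single-quoted or unquoted attributes, which is part of why a parser is the safer choice.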

To remove all img tags, use the following regex pattern:

<img\s+[^>]+>

https://regex101.com/r/HfStzZ/1


To include the specific src url as you described in your question, use the following regex pattern:

<img\s+[^>]*\bsrc="[^"]*\/signature\.gif[^\>]*\>

https://regex101.com/r/HfStzZ/2


In PHP, use the preg_replace function as follows:

$output = preg_replace('/<img\s+[^>]*\bsrc="[^"]*\/signature\.gif[^\>]*\>/', '', $input);
