
Need regex help in PHP 5

Ok. Admittedly, I am not the best at working with regular expressions. What I am doing is a screen scrape, then trying to fix the img src values in the embedded images to point back to the original domain. This is the regex I have been trying variations of (too many to list - here's the current one):

preg_match_all('/<img\b[^>]*>/i', $html, $images);  

What this ends up doing is replacing every < with />. What I need it to do is just return the (currently) five images on the page in an array, so that I can fix their src values and then write them back to $html, which is set at the beginning of the file:

$html = file_get_contents($target_url);

Basically, don't do this with regex. You can parse HTML with regex, but it is almost certainly not worth the effort.

Do it with genuine DOM parsing instead, using the DOMDocument class:

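// Real-world pages are rarely valid HTML; collect libxml's parse warnings
// internally instead of letting loadHTML() emit them.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();
// Prefix the src of every <img> with the original domain (example.com is a placeholder).
$images = $dom->getElementsByTagName('img');
foreach ($images as $image) {
    $image->setAttribute('src', 'http://example.com/' . $image->getAttribute('src'));
}
// Serialize the modified document back to a string.
$html = $dom->saveHTML();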
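
One caveat with the loop above: it prefixes every src, including values that are already absolute, and a src that starts with a slash ends up with a double slash. A minimal sketch of one way around that follows. The function name absolutize_img_src and its $baseUrl parameter are made up for illustration, and it assumes every relative path on the page is relative to the site root, which may not hold for your target page.

```php
<?php
// Hypothetical helper (not part of PHP or DOMDocument): rewrite only the
// relative <img> src values in $html so they point at $baseUrl.
// Assumption: every relative src is meant to be relative to the site root.
function absolutize_img_src($html, $baseUrl)
{
    // Collect libxml parse warnings internally, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $image) {
        $src = $image->getAttribute('src');
        // Leave empty values, URLs with a scheme (http:, https:, data:, ...)
        // and protocol-relative URLs (//host/path) untouched.
        if ($src === '' || preg_match('#^(?:[a-z][a-z0-9+.-]*:|//)#i', $src)) {
            continue;
        }
        // Join base and path with exactly one slash between them.
        $image->setAttribute('src', rtrim($baseUrl, '/') . '/' . ltrim($src, '/'));
    }

    return $dom->saveHTML();
}

// Usage with a small inline fragment instead of a fetched page.
$sample = '<p><img src="/a.png"><img src="http://cdn.example.org/b.png"></p>';
echo absolutize_img_src($sample, 'http://example.com');
```

Note that when loadHTML() is given a fragment like the sample, saveHTML() will typically return it wrapped in a doctype plus html and body elements, so the result is a full document rather than the bare fragment.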
