Update links in HTML using PHP DOMDocument and DOMXPath

I need to update all image links in some HTML. Let's say my HTML looks like this:

<html>
    <body>
        <div class="content">
            <p><a href="example-1.html">This</a> is a normal link. I don't want to change this link.</p>
            <p>But this is an image link: <a href="example-1.html"><img src="http://fpoimg.com/150"></a></p>
        </div>
    </body>
</html>

I'm trying to point all image links (that is, links that contain just an image) at example-2.html, using PHP's DOMDocument and DOMXPath.

Here's the code I have so far:

$dom = new DOMDocument();
$dom->loadHTML( $content );
$imgs = $dom->getElementsByTagName("img");
foreach ($imgs as $img) {
    $parent = $img->parentNode;
}

I'm not sure whether it's faster to grab all of the images and then check each parent, or to grab all of the links and then check each child. I expect the page to have more regular text links than images, so I think the former would be faster.

I'm just not sure where to go from here.
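For reference, the image-first approach described above could be completed roughly as follows. This is a minimal sketch, not a definitive implementation: the `$content` string is the sample markup from the question, and it assumes every `<img>` whose direct parent is an `<a>` should have that link rewritten to example-2.html.

```php
<?php
// Sample markup taken from the question.
$content = <<<HTML
<html>
    <body>
        <div class="content">
            <p><a href="example-1.html">This</a> is a normal link. I don't want to change this link.</p>
            <p>But this is an image link: <a href="example-1.html"><img src="http://fpoimg.com/150"></a></p>
        </div>
    </body>
</html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($content);

foreach ($dom->getElementsByTagName('img') as $img) {
    $parent = $img->parentNode;
    // Only touch images whose direct parent is an <a> element.
    if ($parent instanceof DOMElement && $parent->nodeName === 'a') {
        $parent->setAttribute('href', 'example-2.html');
    }
}

// Serialize the modified document back to a string.
$updated = $dom->saveHTML();
echo $updated;
```

Note that this checks only the direct parent, so it would also rewrite a link that contains an image plus other content.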

Since you actually want to update the a tags, use an XPath expression that selects only those a tags that contain an img tag. The following XPath and code do that:

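$dom = new DOMDocument();
$dom->loadHTML($content);
$xpath = new DOMXPath($dom);
// Select every <a> that has an <img> child carrying a src attribute.
$anchor_list = $xpath->query("//a[img[@src]]");
foreach ($anchor_list as $a) {
    $url = $a->getAttribute('href');
    // Modify the URL as needed; this replacement is only a placeholder.
    $url = str_replace("this", "that", $url);
    $a->setAttribute('href', $url);
}
// Serialize the modified document back to an HTML string.
$content = $dom->saveHTML();
echo $content;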
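The query `//a[img[@src]]` also matches links that contain an image together with other content. If "links that contain just an image" should be taken literally, a stricter predicate may help. The sketch below is one possible variant under that assumption; the helper name `retargetImageLinks` and the sample markup are hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): point every anchor
// whose only content is a single <img> at $newHref.
function retargetImageLinks(string $html, string $newHref): string
{
    $dom = new DOMDocument();

    // Real-world HTML is often imperfect; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Exactly one child element, that element is an <img>, and no
    // non-whitespace text sits directly inside the anchor.
    $query = '//a[@href][count(*) = 1][img][not(text()[normalize-space()])]';

    foreach ($xpath->query($query) as $a) {
        $a->setAttribute('href', $newHref);
    }

    return $dom->saveHTML();
}

// Assumed sample: a text link, an image-only link, and a mixed link.
$sample = '<html><body>'
    . '<p><a href="example-1.html">This</a> is a normal link.</p>'
    . '<p><a href="example-1.html"><img src="http://fpoimg.com/150"></a></p>'
    . '<p><a href="example-1.html"><img src="http://fpoimg.com/150"> with a caption</a></p>'
    . '</body></html>';

$result = retargetImageLinks($sample, 'example-2.html');
echo $result;
```

With this predicate only the second link in the sample is rewritten; the text link and the image-plus-caption link keep their original href.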
