How to get information and images with DOMDocument

Question

<?php 
$htmlget = new DOMDocument();

@$htmlget->loadHTMLFile('http://www.amazon.com');

$xpath = new DOMXPath( $htmlget);
$nodelist = $xpath->query( "//img/@src" );

foreach ($nodelist as $images){
    $value = $images->nodeValue;
}
?>

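I get all the images, but how do I get information about the element each image sits in? For example, on amazon.com there is a Kindle. I now have the image, but I also need the price, description and so on... Thanks.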

Answer 1

That depends on the markup of the requested page. Here is an example that fetches the price on Amazon:

<?php
$htmlget = new DOMDocument();

// @ suppresses the warnings libxml emits for invalid real-world HTML
@$htmlget->loadHTMLFile('http://www.amazon.com');

$xpath = new DOMXPath($htmlget);
$nodelist = $xpath->query('//img/@src');

foreach ($nodelist as $imageSrc) {
    // fetch images whose parent element has the class "imageContainer"
    if ($imageSrc->parentNode->parentNode->getAttribute('class') == 'imageContainer') {
        // skip dummy images
        if (strstr($imageSrc->nodeValue, 'transparent-pixel')) continue;

        // point to the common ancestor of image and product details
        $wrapper = $imageSrc->parentNode->parentNode->parentNode->parentNode->parentNode;

        // fetch the price
        $price = $xpath->query('span[@class="red t14"]', $wrapper);
        if ($price->length) {
            echo '<br/><img src="' . $imageSrc->nodeValue . '">' . $price->item(0)->nodeValue . '<br/>';
        }
    }
}
?>

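However, you should not parse pages that way. If a site wants to provide you with information, there is usually an API. If there is none, they do not want you to grab anything. Parsing like this is unreliable: the markup of the requested page can change at any moment (and you may also open a door to vulnerabilities). It may also not be legal.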
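
If you do control the markup, or are working against a saved copy of a page, the XPath ancestor axis is less fragile than a fixed chain of parentNode calls. What follows is only a minimal sketch: the inline HTML is made up for illustration, and the class names "product" and "price" are assumptions, not Amazon's actual markup.

```php
<?php
// Hypothetical markup standing in for a product listing (not real Amazon HTML).
$html = <<<'HTML'
<div class="product">
  <div class="imageContainer"><a href="/kindle"><img src="kindle.jpg" alt="Kindle"></a></div>
  <span class="price">$79.00</span>
</div>
<div class="product">
  <div class="imageContainer"><a href="/case"><img src="case.jpg" alt="Case"></a></div>
  <span class="price">$19.99</span>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings instead of using @
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$products = [];

foreach ($xpath->query('//div[@class="imageContainer"]//img') as $img) {
    // Nearest ancestor that wraps both the image and its details.
    $wrapper = $xpath->query('ancestor::div[@class="product"][1]', $img)->item(0);
    if ($wrapper === null) {
        continue;   // image is not inside a product wrapper
    }

    // Query relative to the wrapper; ".//" searches its descendants only.
    $price = $xpath->query('.//span[@class="price"]', $wrapper)->item(0);

    $products[] = [
        'src'   => $img->getAttribute('src'),
        'price' => $price !== null ? trim($price->nodeValue) : null,
    ];
}

print_r($products);
?>
```

Because the wrapper is located by class rather than by counting levels, adding or removing an intermediate element around the image does not break the lookup. It still breaks as soon as the class names change.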

Solution 1
1 Accepted 2011-05-24 11:21:52