
DOMDocument/DOMXPath - How to get an HTML DOM element by itemprop and an img src

I am working on a script that extracts data from HTML DOM elements.

Here is my code:

$url = 'http://www.sportsdirect.com/nike-satire-mens-skate-shoes-242188?colcode=24218822';
libxml_use_internal_errors(true); 
$doc = new DOMDocument();
$doc->loadHTMLFile($url);

$xpath = new DOMXpath($doc);

$Name = $xpath->query('//span[@id="ProductName"]')->item(0)->nodeValue;

echo $Name;

This code simply reads the text inside <span id="ProductName"></span>. I know how to get data from elements with a specific class or id.

What I don't know is how to get the src value, e.g. src="http://adres-to-image.com/img.png" (purely an example), from an img tag, or how to get elements that have no id or class but do have an attribute such as itemprop, for example <div itemprop="name"></div>.

  1. How can I get the image src?
  2. How can I get elements by their itemprop attribute?

For your examples:

$xpath->query('//img/@src')->item(0)->nodeValue

This means: select the src attributes of all img elements and take the value of the first one.

$xpath->query('//div[@itemprop="name"]')->item(0)->nodeValue

This means: select all div elements whose itemprop attribute equals "name" and take the text of the first one.
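Both queries can be tried offline against a small inline document. The following is a minimal sketch, not the asker's actual page: the markup in $html (including the product name text) is a hypothetical stand-in, and the null checks are added because item(0) returns null when nothing matches.

```php
<?php
// Hypothetical sample markup standing in for the real product page.
$html = <<<'HTML'
<html><body>
  <span id="ProductName">Nike Satire Mens Skate Shoes</span>
  <div itemprop="name">Nike Satire</div>
  <img src="http://adres-to-image.com/img.png" alt="product">
</body></html>
HTML;

libxml_use_internal_errors(true); // keep HTML parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// item(0) returns null when nothing matches, so check before reading nodeValue.
$srcNode = $xpath->query('//img/@src')->item(0);
$src = $srcNode !== null ? $srcNode->nodeValue : null;

$nameNode = $xpath->query('//div[@itemprop="name"]')->item(0);
$name = $nameNode !== null ? trim($nameNode->nodeValue) : null;

echo $src, PHP_EOL;
echo $name, PHP_EOL;
```

On a live page the same guards help avoid a fatal error when the site changes its markup and a query stops matching.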

You just look for the attributes:

$url = 'http://www.sportsdirect.com/nike-satire-mens-skate-shoes-242188?colcode=24218822';
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTMLFile($url);

$xpath = new DOMXpath($doc);

// Find the container div(s), then read the src of every img inside them.
$containers = $xpath->query('//div[@class="productImageSash"]');
foreach ($containers as $element) {
    $imgs = $element->getElementsByTagName('img');
    foreach ($imgs as $img) {
        $src = $img->getAttribute('src');
        echo $src;
    }
}

Output:

/images/sash/productsash_mustgo.png 
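The nested loops can also be folded into a single XPath expression that selects the src attributes directly. This is a sketch under assumptions: the markup in $html is made up to mirror the class name and output shown above, and the real page may be structured differently.

```php
<?php
// Hypothetical fragment: one matching container and one unrelated div.
$html = '<div class="productImageSash"><img src="/images/sash/productsash_mustgo.png"></div>'
      . '<div class="other"><img src="/images/other.png"></div>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// "//img" after the predicate matches img elements at any depth inside the div.
$sources = [];
foreach ($xpath->query('//div[@class="productImageSash"]//img/@src') as $attr) {
    $sources[] = $attr->nodeValue;
}
print_r($sources);
```

Note that @class="productImageSash" compares the whole attribute value, so a div carrying additional classes would not match; contains(@class, "productImageSash") is a looser alternative.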

The same applies to the itemprop attribute: loop over the div elements and check which of them have it:

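// Walk every div and report the ones that carry an itemprop attribute.
$divs = $xpath->query('//div');
foreach ($divs as $element) {
    // hasAttribute() also catches values such as "" or "0", which a truthiness check on getAttribute() would miss.
    if ($element->hasAttribute('itemprop')) {
        echo "found";
    }
}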
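If the tag name does not matter, the attribute test can move into the XPath expression itself, so only matching elements are returned. A minimal sketch, assuming made-up markup (the itemprop values and the price are illustrative only):

```php
<?php
// Hypothetical fragment with itemprop on different tag types.
$html = '<div itemprop="name">Nike Satire</div>'
      . '<span itemprop="price">29.99</span>'
      . '<div class="plain">no itemprop here</div>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// "//*[@itemprop]" selects every element, of any tag, that has an itemprop attribute.
$found = [];
foreach ($xpath->query('//*[@itemprop]') as $element) {
    $found[$element->getAttribute('itemprop')] = trim($element->textContent);
}
print_r($found);
```

Keying the array by the itemprop value is a design choice for this sketch; if a page repeats the same itemprop, later elements would overwrite earlier ones, and a list of pairs would be safer.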
