
Using XPath to extract data by introducing ancestor in the XPath query

I am using the following code:

$doc = new DOMDocument();
$doc->strictErrorChecking = false;
@$doc->loadHTML($data);
$xpath = new DOMXPath($doc);
// Select the parent node
$categories = $xpath->query('//span[@class="refinementLink"]/ancestor::a/li/ul');
$abcd = array();
var_dump($categories);
foreach ($categories as $category) {
    $abcd[] = $category->nodeValue;
    // Print the value just collected ('<br/>' . $abcd would only print "Array")
    echo '<br/>' . $category->nodeValue;
    // Crafts, Hobbies & Home (19)
}
//var_dump($abcd);

Now, what does this code do? It selects a span tag. The DOM path to the span tag is:

ul--li(4)--a(2)--span(3)

the output is

object(DOMNodeList)[3]

It looks like I am doing things correctly. There are 3 span tags in my HTML document. How can I get the text of these span tags? I need the text between the opening and closing span tags. Any help?

Use ->textContent:

foreach ($categories as $category) {
    $abcd[]=$category->textContent; 
}
var_dump($abcd);
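
As a minimal, self-contained sketch of the same idea: the markup below is hypothetical sample data that only mirrors the ul > li > a > span nesting described in the question, and the query selects the span elements directly rather than walking through ancestor. libxml_use_internal_errors() is used here in place of the @ operator to keep parser warnings quiet.

```php
<?php
// Hypothetical sample markup (an assumption, not the asker's real page).
$data = '<ul>'
      . '<li><a href="#"><span class="refinementLink">Crafts, Hobbies &amp; Home</span> (19)</a></li>'
      . '<li><a href="#"><span class="refinementLink">Cookbooks</span> (7)</a></li>'
      . '</ul>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($data);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select the span elements themselves; each result is a DOMElement.
$spans = $xpath->query('//span[@class="refinementLink"]');

$abcd = array();
foreach ($spans as $span) {
    // textContent is the concatenated text of the node and its descendants.
    $abcd[] = trim($span->textContent);
}

print_r($abcd);
```

With this sample markup, $abcd should hold one string per span, without the counts that sit outside the span inside the a element.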

I think you can probably select the value you want (for example an @attribute) directly in the XPath query. Predicates in XPath do the filtering for you, so less work is left for the foreach.

I use Oxygen XML Developer, which works pretty well for showing what an XPath expression selects from an XML document, so you can be more certain about what to expect.

//span/@text[../@class="refinementLink"]/ancestor::a/li/ul

I am not sure whether text is an attribute of your target, but in XPath whatever comes right before the [] is what you are selecting. You selected a node, so you had to do additional work afterwards. If you pull out a sequence of strings instead, you might get something else. I have never tried it myself; I am just offering an alternative thought.
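
A hedged sketch of what "pulling the strings out in XPath" could look like with PHP's DOMXPath. Two caveats: in XPath the text of an element is a text node reached with text(), not an attribute, and the enclosing list is reached from the span with ancestor::ul (the ul is an ancestor of the span, not a descendant of the a). The markup and the id "cats" are hypothetical and only follow the ul > li > a > span nesting from the question.

```php
<?php
// Hypothetical markup (an assumption); only the nesting follows the question.
$html = '<ul id="cats">'
      . '<li><a href="#"><span class="refinementLink">Cookbooks</span> (7)</a></li>'
      . '<li><a href="#"><span class="refinementLink">Travel</span> (3)</a></li>'
      . '</ul>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// text() selects the text nodes inside each matching span.
$texts = array();
foreach ($xpath->query('//span[@class="refinementLink"]/text()') as $textNode) {
    $texts[] = $textNode->nodeValue;
}

// string() converts the first matching node to a string inside XPath itself;
// evaluate() returns that typed result as a PHP string.
$first = $xpath->evaluate('string(//span[@class="refinementLink"])');

// ancestor::ul walks up from each span to the enclosing list.
// Both spans share one ul, so the node list holds a single element.
$lists = $xpath->query('//span[@class="refinementLink"]/ancestor::ul');
$listId = $lists->item(0)->getAttribute('id');

print_r($texts);
echo $first, "\n", $listId, "\n";
```

query() always returns a DOMNodeList, so a loop is still needed to collect several strings; evaluate() with string() avoids the loop only when a single value is wanted.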

