XPath - get text from all h1, h3 and p tags within a div

Question

I'm currently using the queries below to extract the text within the <h1>, <p> and <h3> tags.

$xpath->query("//div[contains(concat(' ', normalize-space(@class), ' '), ' grid_9 alpha omega newscontainer arena ')]/h1");
$xpath->query("//div[contains(concat(' ', normalize-space(@class), ' '), ' grid_9 alpha omega newscontainer arena ')]/p");
$xpath->query("//div[contains(concat(' ', normalize-space(@class), ' '), ' grid_9 alpha omega newscontainer arena ')]/h3");

The tags sometimes come in a different order, though, so I would like to catch them in order of appearance in the HTML. I tried

$xpath->query('//h1 | //p | //h3');

and that worked well too, but it also caught some <p> tags outside of the div with the class specified above. Running the three queries in sequence didn't work at all, because the original order is lost. Is there a way to combine these queries into one?

Basically, I want to extract all h1, p and h3 tags within a div that has a specific class.

Answer 1

Why don't you try a single query that filters the div's children by name:

$xpath->query("//div[contains(concat(' ', normalize-space(@class), ' '), ' grid_9 alpha omega newscontainer arena ')]/*[local-name()='h1' or local-name()='p' or local-name()='h3']");

This should give you the nodes in order of appearance, restricted to direct children of the matching div. It also stays within XPath 1.0, which I assume is an unmentioned prerequisite.
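To see the ordering behaviour, here is a minimal sketch of that query run against a small page. The sample markup and the variable names ($html, $classTest, $result) are made up for illustration, since the real page is not shown in the question.

```php
<?php
// Hypothetical sample markup: one <p> outside the target div, four children inside it.
$html = <<<HTML
<html><body>
<p>Outside paragraph</p>
<div class="grid_9 alpha omega newscontainer arena">
  <h1>Title</h1>
  <p>First paragraph</p>
  <h3>Subheading</h3>
  <p>Second paragraph</p>
</div>
</body></html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Same class test as in the question, kept in a variable for readability.
$classTest = "contains(concat(' ', normalize-space(@class), ' '), ' grid_9 alpha omega newscontainer arena ')";

// One query: any direct child of the div whose name is h1, p or h3.
$query = "//div[" . $classTest . "]/*[local-name()='h1' or local-name()='p' or local-name()='h3']";

$result = [];
foreach ($xpath->query($query) as $node) {
    // nodeName is the tag name; textContent is the text inside it.
    $result[] = $node->nodeName . ': ' . trim($node->textContent);
}

print_r($result);
```

With this sample, the paragraph outside the div is skipped and the four children come back in the order they appear in the markup.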

Answer 2

When you use // on its own, it matches every element with that tag name anywhere in the document.

You must be more specific, so I suggest:

$xpath->query('//div/h1 | //div/p | //div/h3');
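Note that //div/h1 matches children of any div, not only the one from the question. A possible variation, sketched below, keeps the union but prefixes each branch with the class test. The sample markup and the names $div, $paths and $texts are assumptions for illustration. As the question observes for '//h1 | //p | //h3', PHP returns the union in order of appearance.

```php
<?php
// Hypothetical sample markup: an unrelated div first, then the target div
// with its children in a different order than h1, p, h3.
$html = <<<HTML
<html><body>
<div class="sidebar"><p>Unrelated</p></div>
<div class="grid_9 alpha omega newscontainer arena">
  <h3>Sub</h3>
  <p>Body</p>
  <h1>Head</h1>
</div>
</body></html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Path to the target div, using the class test from the question.
$div = "//div[contains(concat(' ', normalize-space(@class), ' '), ' grid_9 alpha omega newscontainer arena ')]";

// Build one union query: <div path>/h1 | <div path>/p | <div path>/h3
$paths = [];
foreach (['h1', 'p', 'h3'] as $tag) {
    $paths[] = $div . '/' . $tag;
}
$query = implode(' | ', $paths);

$texts = [];
foreach ($xpath->query($query) as $node) {
    $texts[] = trim($node->textContent);
}

print_r($texts);
```

The paragraph in the sidebar div is not matched, because every branch of the union starts from the div with the required class.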

solution1
0 ACCEPTED 2013-12-04 23:12:15

solution2
0 2013-12-04 23:14:22