XPath: how to combine 2 node sets and limit results for one

Question

What would be the correct syntax for this:

//footer//a | (//a[not(//footer)] and position() <=200)

Use only the links under //footer if a footer exists; if not, find all //a that are not in //footer and limit them to 200.

Answer 1

You were really close. The | operator is a union of node sets, not a logical OR, and it already handles your case: if the footer contains no <a> nodes underneath it, the first operand is empty and only the second operand contributes results:

Using Python and parsel (Scrapy's HTML parsing library):

>>> from parsel import Selector
>>> foo = Selector("<footer><a>text</a></footer>")
>>> bar = Selector("<div><a>text</a><a>text2</a><a>text3</a><a>text4</a></div>")
>>> foo.xpath("//footer//a | //a[position()<=2]").get()
'<a>text</a>'

>>> bar.xpath("//footer//a | //a[position()<=2]").getall()
['<a>text</a>', '<a>text2</a>']

Note: I used 2 instead of 200 for brevity.
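
One caveat: a union always evaluates both operands, so when a footer exists the result can still include anchors outside it, and position() in //a[position()<=2] counts among siblings under each parent, not across the whole document. If a strict "footer links, otherwise the first N links" rule is needed, one option is to branch in Python instead of in the expression. Below is a minimal sketch of that idea, assuming well-formed markup and using the standard library's xml.etree.ElementTree rather than parsel so it runs without extra packages; pick_links is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET


def pick_links(markup, limit=200):
    """Return <a> elements under <footer> if any exist, else the first `limit` <a> elements."""
    root = ET.fromstring(markup)
    # Collect anchors below every <footer>, in document order.
    # Assumption: footers are not nested, otherwise an anchor could appear twice.
    footer_links = [a for footer in root.iter("footer") for a in footer.iter("a")]
    if footer_links:
        return footer_links
    # No footer anchors: fall back to all anchors, capped at `limit`.
    return list(root.iter("a"))[:limit]


foo = "<footer><a>text</a></footer>"
bar = "<div><a>text</a><a>text2</a><a>text3</a><a>text4</a></div>"
print([a.text for a in pick_links(foo, limit=2)])
print([a.text for a in pick_links(bar, limit=2)])
```

With parsel the same branching could be done with two xpath() calls. A pure XPath 1.0 variant that should express the same rule is //footer//a | (//a[not(//footer//a)])[position() <= 200], where the parentheses make position() apply to the whole node set.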

solution1
2 ACCEPTED 2021-11-04 10:41:03
