XPath doesn't work on this website

Question

I am scraping individual listing pages from justproperty.com (the individual listing from the original question is no longer active).

I want to get the value of the Ref field.

This is my XPath:

>>> sel.xpath('normalize-space(.//div[@class="info_div"]/table/tbody/tr/td[normalize-space(text())="Ref:"]/following-sibling::td[1]/text())').extract()[0]

This returns no results in Scrapy, despite working in my browser.

Answer 1

The following works perfectly in lxml.html (which modern Scrapy uses):

sel.xpath('.//div[@class="info_div"]//td[text()="Ref:"]/following-sibling::td[1]/text()')

Note that I'm using // to get from the div to the td instead of laying out the explicit path. I'd have to take a closer look at the document to grok why, but the explicit path given for that stretch was incorrect.
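To make the difference concrete, here is a minimal sketch using only the standard library. It is not Scrapy or lxml: xml.etree.ElementTree stands in for them, supports only a small XPath subset (no following-sibling axis, no normalize-space), and the markup and the value JP-12345 are invented for illustration. It only shows that an explicit child path containing tbody matches nothing when the served markup has no tbody, while a descendant step still matches.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the markup as served by the site.
# Note that the source contains no <tbody> element.
SOURCE = """
<html><body>
  <div class="info_div">
    <table>
      <tr><td class="titles">Ref:</td><td>JP-12345</td></tr>
      <tr><td class="titles">Type:</td><td>Apartment</td></tr>
    </table>
  </div>
</body></html>
"""

root = ET.fromstring(SOURCE)

# Explicit child path with a tbody step: nothing matches.
explicit = root.findall('.//div[@class="info_div"]/table/tbody/tr/td')

# Descendant step between div and td: matches with or without a tbody.
loose = root.findall('.//div[@class="info_div"]//td')


def ref_value(tree):
    """Return the text of the cell following the 'Ref:' label, or None.

    ElementTree has no following-sibling axis, so the sibling lookup
    is done in Python by walking each row's cells.
    """
    for row in tree.findall('.//div[@class="info_div"]//tr'):
        cells = row.findall('td')
        if len(cells) >= 2 and (cells[0].text or '').strip() == 'Ref:':
            return (cells[1].text or '').strip()
    return None


print(len(explicit), len(loose), ref_value(root))
```

In Scrapy itself the single XPath expression from this answer does the same job; the Python loop here only compensates for ElementTree's limited XPath support.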

Answer 2

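Don't create XPath expressions by looking at Firebug or Chrome Dev Tools; they show the DOM after the browser has changed the markup, which includes inserting tbody elements that aren't in the HTML source. Remove the /tbody axis step and you'll get exactly what you're looking for.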

normalize-space(.//div[@class="info_div"]/table/tr/td[
  normalize-space(text())="Ref:"
]/following-sibling::td[1]/text())

Read "Why does my XPath query (scraping HTML tables) only work in Firebug, but not the application I'm developing?" for more details.
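If you do end up pasting paths out of the browser, a small helper can drop the tbody steps before the expression reaches your scraper. This is only a sketch: strip_tbody is a hypothetical name, not part of Scrapy or lxml, and it assumes the served HTML really has no tbody. If the raw source does contain one, stripping the step will break the path, so check the page source first.

```python
import re


def strip_tbody(xpath):
    """Remove /tbody child steps from an XPath copied out of browser dev tools.

    Hypothetical helper. It matches "/tbody", optionally followed by a
    positional predicate such as [1], and only when that is a complete
    step (followed by "/" or the end of the expression).
    """
    return re.sub(r'/tbody(\[\d+\])?(?=/|$)', '', xpath)


copied = './/div[@class="info_div"]/table/tbody/tr/td[1]'
print(strip_tbody(copied))
```

Using a descendant step (//) between the table and the cell, as in the first answer, avoids the problem without any rewriting.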

Answer 3

Another XPath that gets the same thing: (.//td[@class='titles']/../td[2])[1]

I tried your XPath using XPath Checker and it works fine.

3 answers

solution1
2 ACCEPTED 2014-02-27 18:03:27

solution2
2 2014-02-27 23:03:44

solution3
0 2014-02-27 18:07:00
