XPath does not work on this website

Question

I am scraping individual listing pages from justproperty.com (the individual listing from the original question is no longer active).

I want to get the value of the Ref field.

This is my XPath:

>>> sel.xpath('normalize-space(.//div[@class="info_div"]/table/tbody/tr/td[normalize-space(text())="Ref:"]/following-sibling::td[1]/text())').extract()[0]

This returns no results in Scrapy, despite working in my browser.

Answer 1

The following works perfectly in lxml.html (which modern Scrapy uses):

sel.xpath('.//div[@class="info_div"]//td[text()="Ref:"]/following-sibling::td[1]/text()')

Note that I'm using // to get between the div and the td, rather than laying out the explicit path. I'd have to take a closer look at the document to grok why, but the path given in that area was incorrect.
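A minimal sketch of why the descendant step is more forgiving, assuming lxml is installed. The markup and the reference value JP-12345 are invented for illustration and are not the real justproperty.com page; extract_ref is a hypothetical helper.

```python
from lxml import html

# Hypothetical markup: the same table row, with and without an explicit <tbody>.
ROW = '<tr><td class="titles">Ref:</td><td> JP-12345 </td></tr>'
PLAIN = f'<html><body><div class="info_div"><table>{ROW}</table></div></body></html>'
WITH_TBODY = (
    f'<html><body><div class="info_div"><table><tbody>{ROW}</tbody></table></div></body></html>'
)

# The descendant step (//) skips over whatever sits between the div and the td.
XPATH = './/div[@class="info_div"]//td[text()="Ref:"]/following-sibling::td[1]/text()'


def extract_ref(markup):
    """Return the stripped Ref value, or None if the XPath matches nothing."""
    tree = html.document_fromstring(markup)
    matches = tree.xpath(XPATH)
    return matches[0].strip() if matches else None


print(extract_ref(PLAIN))
print(extract_ref(WITH_TBODY))
```

Because the expression does not name the intermediate elements, it should match whether or not a tbody is present in the parsed tree.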

Answer 2

Don't create XPath expressions by looking at Firebug or Chrome Dev Tools; they change the markup. Remove the /tbody axis step and you'll receive exactly what you're looking for.

normalize-space(.//div[@class="info_div"]/table/tr/td[
  normalize-space(text())="Ref:"
]/following-sibling::td[1]/text())

Read "Why does my XPath query (scraping HTML tables) only work in Firebug, but not the application I'm developing?" for more details.
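To see the effect outside the browser, here is a small sketch that assumes lxml is installed. The page source below is made up for illustration (it is not the real listing page) and deliberately contains no tbody element, as is common in served HTML.

```python
from lxml import html

# Hypothetical page source as served: there is no <tbody> in the markup.
SOURCE = """
<html><body>
<div class="info_div">
  <table>
    <tr><td class="titles"> Ref: </td><td> JP-12345 </td></tr>
  </table>
</div>
</body></html>
"""

tree = html.document_fromstring(SOURCE)

# Path as copied from browser dev tools, including the /tbody step.
browser_path = (
    'normalize-space(.//div[@class="info_div"]/table/tbody/tr/td['
    'normalize-space(text())="Ref:"]/following-sibling::td[1]/text())'
)
# The same path with the /tbody step removed.
source_path = browser_path.replace("/tbody", "")

with_tbody = tree.xpath(browser_path)      # nothing matches, so an empty string
without_tbody = tree.xpath(source_path)    # the whitespace-normalized Ref value

print(repr(with_tbody))
print(repr(without_tbody))
```

The browser adds tbody to its DOM when rendering, while lxml keeps the table as it appears in the source, so only the second expression finds the cell.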

Answer 3

Another XPath that gets the same thing: (.//td[@class='titles']/../td[2])[1]

I tried your XPath using XPath Checker and it works fine.


Answer 1
2 (accepted), 2014-02-27 18:03:27

Answer 2
2, 2014-02-27 23:03:44

Answer 3
0, 2014-02-27 18:07:00
