
xpath doesn't work in this website

I am scraping individual listing pages from justproperty.com (the individual listing from the original question is no longer active).

I want to get the value of the Ref.

This is my XPath:

>>> sel.xpath('normalize-space(.//div[@class="info_div"]/table/tbody/tr/td[normalize-space(text())="Ref:"]/following-sibling::td[1]/text())').extract()[0]

This returns no results in Scrapy, despite working in my browser.

The following works perfectly in lxml.html (which modern Scrapy uses):

sel.xpath('.//div[@class="info_div"]//td[text()="Ref:"]/following-sibling::td[1]/text()')

Note that I'm using // to get from the div to the td, rather than laying out the explicit path. I'd have to take a closer look at the document to grok why, but the explicit path given for that part was incorrect.
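To illustrate why the descendant step is more forgiving, here is a minimal sketch that runs the same expression against two variants of a table: one as it might appear in the raw page source and one as a browser might show it after inserting a tbody. The markup and the value JP-12345 are hypothetical stand-ins, since the original listing is no longer available.

```python
import lxml.html

XPATH = './/div[@class="info_div"]//td[text()="Ref:"]/following-sibling::td[1]/text()'

# Hypothetical row; the real page's markup may differ.
ROW = '<tr><td>Ref:</td><td>JP-12345</td></tr>'

# The same table without and with a <tbody> wrapper.
raw_source = f'<div class="info_div"><table>{ROW}</table></div>'
browser_dom = f'<div class="info_div"><table><tbody>{ROW}</tbody></table></div>'


def extract_ref(fragment):
    """Parse the fragment inside a full document and apply XPATH."""
    doc = lxml.html.fromstring(f"<html><body>{fragment}</body></html>")
    return doc.xpath(XPATH)


from_raw = extract_ref(raw_source)
from_dom = extract_ref(browser_dom)
print(from_raw, from_dom)
```

Because // matches td at any depth below the div, the expression finds the cell whether or not a tbody sits in between. Note that text()="Ref:" is an exact comparison, so it would miss a label with surrounding whitespace.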

Don't build XPath expressions by copying what Firebug or Chrome Dev Tools show you: browsers change the markup when they build the DOM, for example by inserting tbody elements into tables that lack them. Remove the /tbody axis step and you'll get exactly what you're looking for.

normalize-space(.//div[@class="info_div"]/table/tr/td[
  normalize-space(text())="Ref:"
]/following-sibling::td[1]/text())

Read "Why does my XPath query (scraping HTML tables) only work in Firebug, but not the application I'm developing?" for more details.
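The effect of the /tbody step can be reproduced with lxml.html, which does not add a tbody while parsing. The sketch below assumes a small hypothetical page whose raw HTML has no tbody (the class names follow the question; the value JP-12345 is made up) and evaluates the expression with and without the step.

```python
import lxml.html

# Hypothetical markup standing in for the listing page; note there is no <tbody>.
HTML = """
<html><body>
<div class="info_div">
  <table>
    <tr><td class="titles"> Ref: </td><td> JP-12345 </td></tr>
    <tr><td class="titles">Type:</td><td>Apartment</td></tr>
  </table>
</div>
</body></html>
"""

doc = lxml.html.fromstring(HTML)

# The expression from the question, including the /tbody step.
with_tbody = (
    'normalize-space(.//div[@class="info_div"]/table/tbody/tr/td'
    '[normalize-space(text())="Ref:"]/following-sibling::td[1]/text())'
)
# The same expression with the /tbody step removed.
without_tbody = with_tbody.replace("/tbody", "")

ref_with_tbody = doc.xpath(with_tbody)        # nothing matches, so an empty string
ref_without_tbody = doc.xpath(without_tbody)  # the normalized cell text
print(repr(ref_with_tbody), repr(ref_without_tbody))
```

Since normalize-space() of an empty node-set is an empty string, the first query fails silently rather than raising an error, which can make the problem look like a missing value instead of a wrong path.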

Another XPath that gets the same thing: (.//td[@class='titles']/../td[2])[1]

I tried your XPath using XPath Checker and it works fine.
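A quick sketch of how that alternative behaves in lxml.html, assuming (hypothetically) that the label cells carry class="titles" and that the Ref row comes first in the table. Unlike the other expressions, it selects the td element itself, so the text has to be read from the element afterwards.

```python
import lxml.html

# Hypothetical markup; the row order and the value are assumptions for illustration.
HTML = """
<html><body><div class="info_div"><table>
  <tr><td class="titles">Ref:</td><td> JP-12345 </td></tr>
  <tr><td class="titles">Type:</td><td>Apartment</td></tr>
</table></div></body></html>
"""

doc = lxml.html.fromstring(HTML)

# Second cell of every row that has a "titles" cell, then the first such cell overall.
cells = doc.xpath("(.//td[@class='titles']/../td[2])[1]")
ref_value = cells[0].text_content().strip()
print(ref_value)
```

This expression depends on position rather than on the "Ref:" label, so it would return a different field if the rows were reordered.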
