
XPath is correct but Scrapy doesn't work

I'm trying to scrape two fields from a webpage. I identified the XPath expression for each one and ran the spider, but nothing is extracted.

The webpage: http://www.morningstar.es/es/funds/snapshot/snapshot.aspx?id=F0GBR04MZH

The field I want to extract into an item is ISIN.

The spider runs without errors, but the output is empty.

Here is the relevant line of code:

item['ISIN'] = response.xpath('//*[@id="overviewQuickstatsDiv"]/table/tbody/tr[5]/td[3]/text()').extract()

Try removing tbody from the XPath:

'//*[@id="overviewQuickstatsDiv"]/table//tr[5]/td[3]/text()'

Note that the browser adds this tag while rendering the page; it is absent from the page source that Scrapy receives, so a path that goes through tbody matches nothing.
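To see the effect in isolation, here is a minimal sketch that runs both paths against a hand-written stand-in for the raw page source. The markup, the row contents, and the placeholder value XX0000000000 are assumptions made up for illustration, not the real page. It uses the standard library's ElementTree instead of Scrapy so that it runs without extra dependencies; ElementTree does not support text(), so the sketch reads the matched cell's text attribute instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the raw page source. There is no
# <tbody>: browsers insert that element only when they build the DOM.
RAW_SOURCE = """
<html><body>
<div id="overviewQuickstatsDiv">
  <table>
    <tr><td>Row 1</td><td></td><td>a</td></tr>
    <tr><td>Row 2</td><td></td><td>b</td></tr>
    <tr><td>Row 3</td><td></td><td>c</td></tr>
    <tr><td>Row 4</td><td></td><td>d</td></tr>
    <tr><td>ISIN</td><td></td><td>XX0000000000</td></tr>
  </table>
</div>
</body></html>
"""

root = ET.fromstring(RAW_SOURCE.strip())


def cell_text(path):
    """Return the text of the first element matching path, or None."""
    node = root.find(path)
    return None if node is None else node.text


# Path copied from the browser's inspector: goes through tbody.
with_tbody = cell_text(
    ".//*[@id='overviewQuickstatsDiv']/table/tbody/tr[5]/td[3]"
)
# Same path with tbody dropped, as suggested above.
without_tbody = cell_text(
    ".//*[@id='overviewQuickstatsDiv']/table//tr[5]/td[3]"
)

print(with_tbody)     # no match, because the source has no tbody
print(without_tbody)  # the placeholder value from row 5
```

In a real spider the same comparison can be made in the Scrapy shell by running both expressions through response.xpath against the downloaded page.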

P.S. In my opinion, an even better XPath is one that locates the value by its label instead of by row position:

'//td[.="ISIN"]/following-sibling::td[contains(@class, "text")]/text()'
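The logic of that label-based expression can be spelled out in plain Python, which may help show why it keeps working when rows are added or reordered. This is only a sketch: the table markup, the class names "line heading" and "line text", and the placeholder value are hypothetical, and the helper value_for_label is introduced here for illustration. Unlike the XPath, which returns every match, the helper returns only the first one.

```python
import xml.etree.ElementTree as ET

# Hypothetical table fragment; class names and values are made up.
ROWS = """
<table>
  <tr><td class="line heading">Category</td><td class="line"></td><td class="line text">Example</td></tr>
  <tr><td class="line heading">ISIN</td><td class="line"></td><td class="line text">XX0000000000</td></tr>
</table>
"""

table = ET.fromstring(ROWS.strip())


def value_for_label(table, label):
    """Find the td whose full text equals label, then return the text of the
    first later sibling td whose class attribute contains "text"."""
    for row in table.iter("tr"):
        cells = list(row)
        for i, cell in enumerate(cells):
            # Equivalent of td[.="label"]: compare the cell's string value.
            if cell.tag == "td" and "".join(cell.itertext()) == label:
                # Equivalent of following-sibling::td[contains(@class, "text")].
                for sibling in cells[i + 1:]:
                    if sibling.tag == "td" and "text" in sibling.get("class", ""):
                        return sibling.text
    return None


print(value_for_label(table, "ISIN"))
```

Because the lookup keys on the label text, it does not depend on ISIN being the fifth row or the value being the third cell.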

I think response.selector was not specified. Try this:

response.selector.xpath('//*[@id="overviewQuickstatsDiv"]/table/tbody/tr[5]/td[3]/text()').extract()
