
Scrapy shell: correct XPath selector for getting info from a table?

I'm trying to obtain the correct XPath for extracting the information circled in red in the image below:

[image: enter image description here]

I've tried copying the XPath and pasting it into the Scrapy shell, but it isn't working. I'm having difficulties because the information is contained inside a table and every element of the table has the same name. The website is:

https://virtualmuebles.com/muebles-sala/mesa-tv-invy-1c-casa-linda-wg

This assumes the text 'Marca' is constant on all the pages you want to scrape. First search for a b element containing the text 'Marca'. Step to its parent, provided it is a td element. Then step to the following sibling, provided it is a td element. Finally, get its text node:

response.xpath("//b[contains(text(),'Marca')]/parent::td/following-sibling::td/text()").get()

Otherwise, if it is always the second td element of the fourth tr element:

response.xpath("//tr[4]/td[2]/text()").get()

Output:

'RTA Design'
