Scrapy shell: correct XPath selector for getting info from a table?
I'm trying to obtain the correct XPath for extracting the information circled in red in the image below:
[image]
I've tried copying the XPath and pasting it into the Scrapy shell, but it isn't working. I'm having difficulties because the information is contained inside a table and every element of the table has the same name. The website is
https://virtualmuebles.com/muebles-sala/mesa-tv-invy-1c-casa-linda-wg
Assuming the text Marca is constant on all the pages you want to scrape: first search for a b element containing the text 'Marca'. Find its parent if it is a td element. Get the following sibling if it is a td element. Get its text node:
response.xpath("//b[contains(text(),'Marca')]/parent::td/following-sibling::td/text()").get()
Otherwise, if it is always the second td element of the fourth tr element:
response.xpath("//tr[4]/td[2]/text()").get()
Output:
'RTA Design'