How to scrape HTML tags with no classes and changing ids using Scrapy?

I want to scrape the sold price of each property from http://house.speakingsame.com/p.php?q=Brisbane+City&sta=qld

The page has no formatting and no classes, and the content is laid out in tables.

What should I do in this case? Each table represents one property. I need the sold price for each property, and therefore from each table.

response.css('tbody').getall() returns nothing at all.
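
A likely cause, stated here as an assumption because the page source is not shown in the question: browsers add tbody elements while rendering a table, so they show up in the developer tools but not in the raw HTML that Scrapy downloads. Below is a minimal sketch of how one might check this with the standard library. The sample markup and the TagCounter name are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts the start tags found in raw (unrendered) HTML."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


# Hypothetical markup shaped like the page described above: nested tables
# with no classes and no explicit <tbody>.
RAW_HTML = """
<table><tr><td>
  <table><tr><td><b>Sold $640,000</b></td></tr></table>
</td></tr></table>
"""

counter = TagCounter()
counter.feed(RAW_HTML)
tag_counts = dict(counter.counts)
print(tag_counts)  # no "tbody" key: the raw markup never contained one
```

In the Scrapy shell the same check could be run on response.text. If tbody is absent there, a selector that skips it, such as //table//tr, is the safer choice.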

Using XPath, you could do this:

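for element in response.xpath("//table//table//table"):
    # First <b> in the table holds the sold price, e.g. "Sold $640,000".
    # Indexing inside the XPath returns None instead of raising IndexError
    # when a table has no <b>.
    sold = element.xpath("(.//b)[1]/text()").get()
    print(sold)
    # Text of the first <td>, which the answer treats as the date.
    date = element.xpath("(.//td)[1]/text()").get()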

Output:

Sold $640,000
Sold $640,000
Sold $320,000
Sold $320,000
Sold $145,000
Sold $145,000
Sold $145,000
Sold $145,000
Sold $239,000
Sold $239,000
Sold $695,000
Sold $695,000
Sold $740,000
Sold $740,000
Sold $375,000
Sold $375,000
Sold $390,000
Sold $390,000
Sold $695,000
Sold $695,000
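
The extracted values are strings such as "Sold $640,000". If a numeric price is needed, one option is to pull the amount out with a regular expression. This is a minimal sketch, not part of the original answer; parse_sold_price and PRICE_RE are hypothetical names, and the samples are taken from the output above.

```python
import re

# Matches a dollar amount such as "$640,000" and captures the digits.
PRICE_RE = re.compile(r"\$(\d[\d,]*)")


def parse_sold_price(text):
    """Return the price in whole dollars, or None if no amount is found."""
    if text is None:
        return None
    match = PRICE_RE.search(text)
    if match is None:
        return None
    # Drop the thousands separators before converting.
    return int(match.group(1).replace(",", ""))


samples = ["Sold $640,000", "Sold $320,000", "Sold $145,000"]
prices = [parse_sold_price(s) for s in samples]
print(prices)  # [640000, 320000, 145000]
```

Inside the loop, this could be applied as parse_sold_price(sold) before storing or yielding the value. It returns None for tables that have no price text.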
