How to access data when web scraping with CSS selectors in Python using Scrapy
I was trying to scrape BHK details from a website using Scrapy with CSS selectors in Python. Could anyone please let me know how to access the texts "3 BHK Residential Apartment", "3,4 BHK Independent House/Villa", and so on from the output below?
Output:
{'bhk': ['<tr class="NpsrpTuple__subHead"><td>3 BHK Residential Apartment<!-- '
'--> </td><td></td><td class="undefined"><span>Under '
'Construction</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>3,4 BHK Independent '
'House/Villa<!-- --> </td><td></td><td class="undefined"><span>Under '
'Construction</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
'Apartment<!-- --> </td><td></td><td class="undefined"><span>Ready To '
'Move</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
'Apartment<!-- --> </td><td></td><td class="undefined"><span>Ready To '
'Move</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>3,4,5 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->8518<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
'class="undefined"><span>Ready To Move</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->4226<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
'class="undefined"><span>Ready To Move</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>1,2 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->4700<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
'class="undefined"><span>Ready To Move</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->4199<!-- --> / Sq.Ft. </span></td><td '
'class="undefined"><span>Ready To Move</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>3,4 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->7500<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
'class="undefined"><span>Ready To Move</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>3,4 BHK Independent '
'House/Villa<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ '
'<!-- -->4947<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
'class="undefined"><span>Under Construction</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>4 BHK Residential Apartment<!-- '
'--> </td><td></td><td class="undefined"><span>Under '
'Construction</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3,4 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->7229<!-- --> / Sq.Ft. </span></td><td '
'class="undefined"><span>Ready To Move</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3,4 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->6572<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
'class="undefined"><span>Ready To Move</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
'Apartment<!-- --> </td><td></td><td class="undefined"><span>New '
'Launch</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>3,4 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->6453<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
'class="undefined"><span>Under Construction</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2 BHK Residential Apartment<!-- '
'--> </td><td></td><td class="undefined"><span>Ready To '
'Move</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>3,4 BHK Independent '
'House/Villa<!-- --> </td><td></td><td class="undefined"><span>New '
'Launch</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>4 BHK Independent '
'House/Villa<!-- --> </td><td></td><td class="undefined"><span>Under '
'Construction</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->6618<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
'class="undefined"><span>Under Construction</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->5900<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
'class="undefined"><span>Under Construction</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
'-->5233<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
'class="undefined"><span>Ready To Move</span></td></tr>']}
I tried running this code:
bhk = response.css('.NpsrpTuple__subHead').extract()
Not sure if this is what you want to achieve:
from scrapy import Selector
item = {'bhk': ['<tr class="NpsrpTuple__subHead"><td>3 BHK Residential Apartment<!-- '
'--> </td><td></td><td class="undefined"><span>Under '
'Construction</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>3,4 BHK Independent '
'House/Villa<!-- --> </td><td></td><td class="undefined"><span>Under '
'Construction</span></td></tr>',
'<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
'Apartment<!-- --> </td><td></td><td class="undefined"><span>Ready To '
'Move</span></td></tr>']
}
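# Join the row fragments into one HTML string so it can be parsed at once.
items = "".join(item['bhk'])
response = Selector(text=items)
# ::text selects the text nodes directly inside each <td>.
for elem in response.css('tr > td::text').getall():
    print(elem)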
Output:
3 BHK Residential Apartment
3,4 BHK Independent House/Villa
2,3 BHK Residential Apartment