How to access data when web scraping with CSS selectors in Python using Scrapy

I am trying to scrape BHK details from a website using Scrapy's CSS selectors in Python. Could anyone tell me how to extract the text values, such as "3 BHK Residential Apartment" and "3,4 BHK Independent House/Villa", from the output below?

Output:

{'bhk': ['<tr class="NpsrpTuple__subHead"><td>3 BHK Residential Apartment<!-- '
         '--> </td><td></td><td class="undefined"><span>Under '
         'Construction</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>3,4 BHK Independent '
         'House/Villa<!-- --> </td><td></td><td class="undefined"><span>Under '
         'Construction</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
         'Apartment<!-- --> </td><td></td><td class="undefined"><span>Ready To '
         'Move</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
         'Apartment<!-- --> </td><td></td><td class="undefined"><span>Ready To '
         'Move</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>3,4,5 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->8518<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
         'class="undefined"><span>Ready To Move</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->4226<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
         'class="undefined"><span>Ready To Move</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>1,2 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->4700<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
         'class="undefined"><span>Ready To Move</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->4199<!-- --> / Sq.Ft. </span></td><td '
         'class="undefined"><span>Ready To Move</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>3,4 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->7500<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
         'class="undefined"><span>Ready To Move</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>3,4 BHK Independent '
         'House/Villa<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ '
         '<!-- -->4947<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
         'class="undefined"><span>Under Construction</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>4 BHK Residential Apartment<!-- '
         '--> </td><td></td><td class="undefined"><span>Under '
         'Construction</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3,4 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->7229<!-- --> / Sq.Ft. </span></td><td '
         'class="undefined"><span>Ready To Move</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3,4 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->6572<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
         'class="undefined"><span>Ready To Move</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
         'Apartment<!-- --> </td><td></td><td class="undefined"><span>New '
         'Launch</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>3,4 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->6453<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
         'class="undefined"><span>Under Construction</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2 BHK Residential Apartment<!-- '
         '--> </td><td></td><td class="undefined"><span>Ready To '
         'Move</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>3,4 BHK Independent '
         'House/Villa<!-- --> </td><td></td><td class="undefined"><span>New '
         'Launch</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>4 BHK Independent '
         'House/Villa<!-- --> </td><td></td><td class="undefined"><span>Under '
         'Construction</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->6618<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
         'class="undefined"><span>Under Construction</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->5900<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
         'class="undefined"><span>Under Construction</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
         'Apartment<!-- --> </td><td><span class="NpsrpTuple__webRupee">₹ <!-- '
         '-->5233<!-- --> / Sq.Ft. <!-- -->Onwards</span></td><td '
         'class="undefined"><span>Ready To Move</span></td></tr>']}

This is the code I ran:

bhk = response.css('.NpsrpTuple__subHead').extract()

I'm not sure this is exactly what you want to achieve, but you can re-parse the extracted row strings with a Selector and select the text nodes of the td cells:

from scrapy import Selector

item = {'bhk': ['<tr class="NpsrpTuple__subHead"><td>3 BHK Residential Apartment<!-- '
         '--> </td><td></td><td class="undefined"><span>Under '
         'Construction</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>3,4 BHK Independent '
         'House/Villa<!-- --> </td><td></td><td class="undefined"><span>Under '
         'Construction</span></td></tr>',
         '<tr class="NpsrpTuple__subHead"><td>2,3 BHK Residential '
         'Apartment<!-- --> </td><td></td><td class="undefined"><span>Ready To '
         'Move</span></td></tr>']
    }


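# Join the extracted row strings into a single HTML fragment.
html = "".join(item['bhk'])

selector = Selector(text=html)
# Take only the first text node of the first <td> in each row; the text node
# that follows the <!-- --> comment contains nothing but whitespace.
for row in selector.css('tr'):
    print(row.css('td:first-child::text').get())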

Output:

3 BHK Residential Apartment
3,4 BHK Independent House/Villa
2,3 BHK Residential Apartment
