Python Scrapy Extract Value of aria-label

Question

I'm new to Scrapy and I'm trying to scrape a page that has an aria-label attribute on a div:

<body>
  <div class="item-price" aria-label="$1.99">
    .....
  </div>
</body>

I'm trying to extract the label with the following parse method in my spider:

def parse(self, response):
   price = circular_item.css("div.item-price > aria-label::text").extract()
   yield price

When I run the spider I get the following error:

2018-09-02 18:34:03 [scrapy.core.scraper] ERROR: Spider must return Request, BaseItem, dict or None, got 'list' in <GET https://example.com/test.html>

How do I extract the value of aria-label here?

Answer 1

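You have several errors in your code: circular_item is never defined (use response), aria-label is an attribute rather than a child element, so "> aria-label::text" cannot select it, and extract() returns a list, which a spider callback is not allowed to yield. Select the attribute and yield a dict instead: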

def parse(self, response):
   item = {}
   item["price"] = response.xpath('//div[@class="item-price"]/@aria-label').extract_first()
   yield item

Answer 2

If you want to use a css extractor instead of xpath:

def parse(self, response):
    # Yield a dict; a bare {value} literal would be a set, which Scrapy rejects.
    item = {"price": response.css('div.item-price::attr(aria-label)').extract_first()}
    yield item

2 answers

solution1
1 ACCEPTED 2018-09-02 23:18:40

solution2
1 2018-09-03 09:36:14
