I'm new to Scrapy and I'm trying to scrape a page where a div has an aria-label attribute:
<body>
<div class="item-price" aria-label="$1.99">
.....
</div>
</body>
I'm trying to extract the label with the following parse on my spider:
def parse(self, response):
    price = circular_item.css("div.item-price > aria-label::text").extract()
    yield price
When I run the spider I get the following error:
2018-09-02 18:34:03 [scrapy.core.scraper] ERROR: Spider must return Request, BaseItem, dict or None, got 'list' in <GET https://example.com/test.html>
How do I extract the value of aria-label here?
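Your code has several errors. First, aria-label is an attribute of the div, not a child element, so the selector "div.item-price > aria-label::text" matches nothing. Second, extract() returns a list, and a spider callback must yield a Request, an item, a dict or None, which is exactly what the error message says. Third, circular_item is not defined in the snippet you posted, so query response instead. Select the attribute and yield it inside a dict: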
def parse(self, response):
    item = {}
    item["price"] = response.xpath('//div[@class="item-price"]/@aria-label').extract_first()
    yield item
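If you prefer a CSS selector to XPath, use ::attr() to read the attribute:
def parse(self, response):
    # Build a dict with a "price" key; braces around a bare value would create a set, which Scrapy rejects.
    item = {"price": response.css('div.item-price::attr(aria-label)').extract_first()}
    yield item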