How do I return more than one item with Scrapy?
I'm trying to learn the basics of Scrapy. I've written the spider below to scrape one of the practice websites, books.toscrape.com. The spider scrapes the site, and when I just tell it to print the title and price, it prints them for every book on the site. But when I use yield, as below, it only returns the information for the last book listed on the site.
I've no doubt my mistake is really simple, but I can't work out what it is.
Can anyone tell me why this only scrapes the final title and price listed on the site?
Thanks!
import scrapy

class FirstSpider(scrapy.Spider):
    name = "CW"
    start_urls = ['http://books.toscrape.com/']

    def parse(self, response):
        books = response.xpath('//article[@class="product_pod"]')
        for item in books:
            title = item.xpath('.//h3/a/@title').getall()
            price = item.xpath('.//div/p[@class="price_color"]').getall()
        yield {
            'title': title,
            'price': price,
        }
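You misindented the yield. It sits outside the for loop, so it runs only once, after the loop has finished, when title and price still hold the values from the last book. Indent it so that it is part of the loop body. Fixed: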
import scrapy

class FirstSpider(scrapy.Spider):
    name = "CW"
    start_urls = ['http://books.toscrape.com/']

    def parse(self, response):
        books = response.xpath('//article[@class="product_pod"]')
        for item in books:
            title = item.xpath('.//h3/a/@title').getall()
            price = item.xpath('.//div/p[@class="price_color"]').getall()
            # yield is inside the loop, so one item is emitted per book
            yield {
                'title': title,
                'price': price,
            }
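The effect of the indentation can be seen without Scrapy at all. The following is a minimal sketch using plain Python generators; the BOOKS list and both function names are made up for illustration and are not part of the original spider.

```python
# Hypothetical stand-in data; not scraped from the real site.
BOOKS = [
    ("Book A", "£1.00"),
    ("Book B", "£2.00"),
    ("Book C", "£3.00"),
]


def parse_yield_outside_loop(books):
    """Mimics the question's spider: yield comes after the loop."""
    for title, price in books:
        pass  # each iteration only rebinds title and price
    # Runs once, after the loop ends, so only the last book is emitted.
    yield {"title": title, "price": price}


def parse_yield_inside_loop(books):
    """Mimics the fixed spider: yield is part of the loop body."""
    for title, price in books:
        yield {"title": title, "price": price}


outside = list(parse_yield_outside_loop(BOOKS))
inside = list(parse_yield_inside_loop(BOOKS))

print(len(outside), outside[0]["title"])  # one item, the last book
print(len(inside))                        # one item per book
```

A print placed in the same position as the misindented yield would show the same behaviour; the print in the original test was presumably inside the loop, which is why it showed every book. As a side note, getall() returns a list of strings and the price XPath selects the whole p element, so each field ends up as a list containing markup; if plain strings are wanted, something like .get() together with /text() on the price XPath is likely a better fit.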