
How do I return more than one item with Scrapy?

I'm trying to learn the basics of Scrapy. I've written the spider below to scrape one of the practice websites, books.toscrape.com. When I just tell it to print the title and price, it prints them for every book on the page, but when I use yield, as below, it only returns the information for the last book listed.

I've no doubt my mistake is really simple, but I can't work out what it is.

Can anyone tell me why this only scrapes the final title and price listed on the page?

Thanks!

import scrapy

class FirstSpider(scrapy.Spider):
    name="CW"
    start_urls = ['http://books.toscrape.com/']

    def parse(self,response):
        books = response.xpath('//article[@class="product_pod"]')

        for item in books:
            title = item.xpath('.//h3/a/@title').getall()
            price = item.xpath('.//div/p[@class="price_color"]').getall()

        yield {
            'title': title,
            'price': price,
        }

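You misindented the yield. It sits outside the for loop, so it runs only once, after the loop has finished, when title and price hold the values from the last book. Indent it so that it is part of the loop body. Fixed: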

import scrapy


class FirstSpider(scrapy.Spider):
    name = "CW"
    start_urls = ['http://books.toscrape.com/']

    def parse(self, response):
        # Each book on the page sits in its own <article class="product_pod">.
        books = response.xpath('//article[@class="product_pod"]')

        for item in books:
            title = item.xpath('.//h3/a/@title').getall()
            price = item.xpath('.//div/p[@class="price_color"]').getall()

            # Yield inside the loop so that one item is emitted per book.
            yield {
                'title': title,
                'price': price,
            }
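
To see why the position of the yield matters, here is a minimal sketch in plain Python that needs no Scrapy install. The BOOKS list and both function names are hypothetical stand-ins made up for illustration; they are not real data from the site and not part of the Scrapy API. The first generator mimics the spider from the question, the second mimics the fix.

```python
# Hypothetical stand-in for the scraped books; not real data from the site.
BOOKS = [
    {"title": "Book A", "price": "10.00"},
    {"title": "Book B", "price": "20.00"},
    {"title": "Book C", "price": "30.00"},
]


def parse_yield_after_loop(books):
    """Mimics the question's spider: the yield runs once, after the loop."""
    for item in books:
        title = item["title"]
        price = item["price"]
    # The loop has finished, so title and price hold the last book's values.
    yield {"title": title, "price": price}


def parse_yield_in_loop(books):
    """Mimics the fix: the yield is in the loop body, one item per book."""
    for item in books:
        yield {"title": item["title"], "price": item["price"]}


after_loop = list(parse_yield_after_loop(BOOKS))
in_loop = list(parse_yield_in_loop(BOOKS))

print(len(after_loop), after_loop[0]["title"])  # 1 Book C
print(len(in_loop))                             # 3
```

The print version in the question presumably showed every book because the print call was inside the loop, while the yield was not.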
