Scrapy + Python, returning multiple items, issue reading page
I am trying to extract multiple items into a database using Scrapy with Python. To build my code, I first used the Scrapy shell to load the page and test the lines of code that extract the data.
scrapy shell "http://www.goodmans.net/d/1706/brands.htm"
I tried the following expression and got the result I wanted (all brands extracted):
response.css('.SubDepartments a::text').extract()
Then I wrote the spider below and ran it with
scrapy crawl goodmans
and it gave me an error:
import scrapy
import pandas as pd

class GoodmanSpider(scrapy.Spider):
    name = "goodmans"
    start_urls = ['http://www.goodmans.net/d/1706/brands.htm']
    def parse(self, response):
        category = response.css('.SubDepartments a::text').extract()
        category_url = response.css('.SubDepartments a::attr(href)').extract()
        yield {'Category': category, 'url': categoy_url}
The interesting part of the error is not visible in your screenshot. The last lines say:
  ... line 10, in parse
    yield {'Category': category, 'url': categoy_url}
NameError: name 'categoy_url' is not defined
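So, a simple misspelling :) The variable is assigned as category_url but referenced as categoy_url in the yield line. The shell test never hit this because it only ran the selector, not the yield.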