Scrapy print items to csv, each in own row
I am scraping Amazon with Scrapy and trying to export the name and price of each product to a CSV file. When I do that, Scrapy seems to append the items to a list, so each row of the CSV holds the list of all products on that page (the same applies to the price column). I want each item and its respective price to be written to its own row in the CSV file. Below is my scraping code.
    class ScrapeSpider(scrapy.Spider):
        name = 'scrape'
        start_urls = ['https://www.amazon.com/s?i=aps&k=laptop&ref=nb_sb_noss_1&url=search-alias%3Daps']

        def parse(self, response):
            item = AmazonItem()
            name = '\n'.join(response.css('.a-text-normal.a-color-base.a-size-medium').css('::text').extract())
            price = '\n'.join(response.css('.a-offscreen').css('::text').extract())
            item['name'] = name
            item['price'] = price
            yield item
            for next_page in response.css('.a-pagination .a-last a'):
                yield response.follow(next_page, self.parse)
[Image: the resulting CSV file]
Below is the command I run in the terminal to execute the scrape:
    scrapy crawl scrape -o data.csv
Select the container element of each product, which gives you a list of selectors, and iterate over that list, treating each entry as its own selector called product. Then extract the name and price from each product individually, yielding one item per product instead of one item per page.
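    def parse(self, response):
        # One selector per product container on the results page
        items = response.css('.s-result-list .sg-col-inner')
        for product in items:
            item = AmazonItem()
            # Query relative to this product only, so name and price stay paired
            item['name'] = product.css('span.a-text-normal::text').get()
            item['price'] = product.css('.a-offscreen::text').get()
            yield item
        # The href sits on the <a> inside .a-last, as in the question's selector
        next_page = response.css('.a-last a::attr(href)').get()
        if next_page:
            yield scrapy.Request(response.urljoin(next_page), callback=self.parse)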
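
To see why the original spider produced one CSV row per page while the loop above produces one row per product, here is a minimal sketch that uses only the standard library instead of Scrapy. The product names and prices are made up, and the helper functions (rows_joined, rows_per_product, to_csv) are hypothetical names for illustration. They only approximate what a CSV feed export does with each yielded item.

```python
import csv
import io

# Hypothetical values for one results page (not real Amazon data).
names = ["Laptop A", "Laptop B", "Laptop C"]
prices = ["$499.99", "$629.00", "$899.50"]


def rows_joined(names, prices):
    """Mimic the question's parse(): a single item per page, values joined by newlines."""
    yield {"name": "\n".join(names), "price": "\n".join(prices)}


def rows_per_product(names, prices):
    """Mimic the per-product loop: one item for each product."""
    for name, price in zip(names, prices):
        yield {"name": name, "price": price}


def to_csv(rows):
    """Write dict rows to CSV text; each yielded item becomes exactly one CSV row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


joined_csv = to_csv(rows_joined(names, prices))
per_product_csv = to_csv(rows_per_product(names, prices))
print(per_product_csv)
```

The joined version yields a single item, so the export writes a single data row whose cells contain every name and every price on the page. The per-product version yields three items and therefore three rows. Note that the sketch pairs names and prices with zip, which only works when both lists line up. On a real page some products may have no price, which is why the spider above queries name and price inside each product container rather than pairing two page-wide lists.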