Scrapy + Python: CSV file not exported in the correct order
I am creating a CSV file with my spider, but the data comes out in a strange order:
My code:
class GoodmanSpider(scrapy.Spider):
    name = "goodmans"
    start_urls = ['http://www.goodmans.net/d/1706/brands.htm']

    def parse(self, response):
        items = TutorialItem()
        all_data = response.css('.SubDepartments')
        for data in all_data:
            category = data.css('.SubDepartments a::text').extract()
            category_url = data.css('.SubDepartments a::attr(href)').extract()
            items['category'] = category
            items['category_url'] = category_url
            yield items
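My items.py file
You have stacked all of the values into a single item. Because the selectors return lists, each yielded item should instead be a dict holding one value per key.
Try something like: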
for cat, url in zip(category, category_url):
    item = dict(category=cat, category_url=url)
    yield item
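To see why the pairing matters, here is a minimal sketch that runs without Scrapy. The sample lists and the `to_csv` helper are hypothetical stand-ins: the lists play the role of what the two selectors return, and the helper joins list values with commas to approximate how a list-valued field ends up in a single CSV cell.

```python
import csv
import io

# Hypothetical sample data standing in for the two lists the selectors return.
category = ["Brand A", "Brand B", "Brand C"]
category_url = ["/d/1/a.htm", "/d/2/b.htm", "/d/3/c.htm"]


def to_csv(rows, fieldnames=("category", "category_url")):
    """Write a list of dicts to CSV text (illustrative helper, not Scrapy's exporter)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Join list values into one cell, approximating a list-valued field.
        flat = {k: ",".join(v) if isinstance(v, list) else v for k, v in row.items()}
        writer.writerow(flat)
    return buf.getvalue()


# One item holding whole lists: everything lands in a single row.
stacked = [{"category": category, "category_url": category_url}]

# One dict per (category, url) pair: one row per brand.
per_row = [dict(category=cat, category_url=url)
           for cat, url in zip(category, category_url)]

stacked_csv = to_csv(stacked)
per_row_csv = to_csv(per_row)
print(stacked_csv)
print(per_row_csv)
```

The stacked version produces a header plus one crowded row, while the zipped version produces a header plus one row per brand, which is the shape a CSV reader expects.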
Here is the corrected code, based on Michael's answer. It works perfectly.
import scrapy
from ..items import TutorialItem
import pandas as pd


class GoodmanSpider(scrapy.Spider):
    name = "goodmans"
    start_urls = ['http://www.goodmans.net/d/1706/brands.htm']

    def parse(self, response):
        items = TutorialItem()
        all_data = response.css('.SubDepartments')
        for data in all_data:
            category = data.css('.SubDepartments a::text').extract()
            category_url = data.css('.SubDepartments a::attr(href)').extract()
            items['category'] = category
            items['category_url'] = category_url
            for cat, url in zip(category, category_url):
                item = dict(category=cat, category_url=url)
                yield item
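If the remaining concern is the order of the columns in the exported file, Scrapy's `FEED_EXPORT_FIELDS` setting can pin it. This is a minimal sketch that was not part of the original answer; it assumes the two field names used above and that the line goes in the project's settings.py.

```python
# settings.py (sketch): fix the CSV column order for feed exports.
# The field names are the ones yielded by the spider above.
FEED_EXPORT_FIELDS = ["category", "category_url"]
```

The spider can then be run with something like `scrapy crawl goodmans -o brands.csv`, where the output file name is only an example.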