Here is my project's spider, which works fine:
# -*- coding: utf-8 -*-
import time

import scrapy


class SccbotakiSpider(scrapy.Spider):
    name = 'SccBotaki'
    start_urls = ['url']
    # Note: this runs once, when the class body is executed, not between requests.
    time.sleep(1)

    def parse(self, response):
        # Extracted, but not used in the yielded items yet.
        daten = response.css('#daten').extract()
        cartext = response.css('div.car_header > b::text').extract()
        spacerimg = response.css('div.rechts > img::attr(src)').extract()

        # Per-product fields, taken from the product containers.
        inhalt = response.css('div.inhalt')
        prodname = inhalt.css('div.prod-name::text').extract()
        artnr = inhalt.css('div.art-nr > span::text').extract()
        avaible = inhalt.css('div.ampel > img::attr(src)').extract()
        price = inhalt.css('div.preis::text').extract()

        for item in zip(prodname, artnr, avaible, price):
            scraped_info = {
                'prodname': item[0],
                'artnr': item[1],
                'avaible': item[2],
                'price': item[3],
            }
            yield scraped_info
The target URL is shown in the linked image (URL Image), because I cannot post a shortened URL here.
I also want to scrape daten, cartext, and spacerimg, but when I add them I get different, bad results. In settings.py I set up the CSV export like this:
#Export as CSV Feed
FEED_FORMAT = "csv"
FEED_URI = "UltraRacing.csv"
So my question is: why can't I get output like the one in my image once I add daten, cartext, and spacerimg? If I scrape all of them together, the CSV contains just one row, with all of the information in a single cell. If I remove daten, cartext, and spacerimg from the loop, I get perfect results.
I hope this makes sense.
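You are trying to zip lists of different sizes: prodname, artnr, avaible, and price have 41 elements each, but daten and cartext have only 1 element each, and spacerimg has 9 elements. Because zip stops at the shortest input, adding a 1-element list leaves you with a single row.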
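A minimal sketch of the truncation and one possible workaround, written with plain lists so it runs without Scrapy. The helper name build_rows is hypothetical, and the sketch assumes that daten and cartext are page-level values that should be repeated on every product row. It leaves spacerimg out, because the question does not say how its 9 entries relate to the 41 products.

```python
def build_rows(prodname, artnr, avaible, price, daten, cartext):
    """Combine per-product lists with page-level values (hypothetical helper)."""
    # Assumption: daten and cartext occur once per page, so take the first
    # match (or None if nothing matched) instead of zipping them.
    page_daten = daten[0] if daten else None
    page_cartext = cartext[0] if cartext else None

    rows = []
    # Only zip the lists that really have one entry per product.
    for name, nr, av, pr in zip(prodname, artnr, avaible, price):
        rows.append({
            'daten': page_daten,
            'cartext': page_cartext,
            'prodname': name,
            'artnr': nr,
            'avaible': av,
            'price': pr,
        })
    return rows


# zip stops at the shortest input: three products zipped with a
# one-element list produce a single tuple.
truncated = list(zip(['p1', 'p2', 'p3'], ['only one']))
print(len(truncated))

# Repeating the page-level values keeps one row per product.
rows = build_rows(
    ['p1', 'p2'], ['001', '002'], ['green.png', 'red.png'], ['10', '20'],
    ['page data'], ['Some Car'],
)
print(len(rows))
```

Inside the spider, the same idea would mean yielding each row dictionary from parse instead of collecting them in a list.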