Issue storing data in CSV using Scrapy

Question

Below is the parse method of my Scrapy spider. I expect the CSV output to have three columns with their corresponding values. The terminal output shows all three columns (it even reports 84 items stored in output.csv, which is correct), but the actual output file contains only the first column, "Title". Any help is appreciated.

EDIT: When I export to JSON instead, all the data is there.

    def parse(self, response):
        for titl in response.xpath('//span[@class="jv-job-list-title"]/text()').extract():
            title = titl.strip()
            yield {"Title":title}
        for dep in response.xpath('//span[@class="jv-job-list-category"]/text()').extract():
            department = dep.strip()
            yield{"Department":department}
        for countr in response.xpath('//td[@class="jv-job-list-name"]/span[2]/text()').extract():
            country = countr.strip()
            yield{"Country":country}

Command used to run the spider:

    scrapy crawl task -o output.csv

Complete code:

import scrapy


class TaskUs(scrapy.Spider):
    name = 'task'
    start_urls = ["https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0"]

    # def start_requests(self):
    #     for URL in self.start_urls:
    #         yield scrapy.Request(url=URL, meta={'proxy': 'http://103.241.227.108:6666'}, callback=self.parse)

    def parse(self, response):
        # for titl in response.xpath('//span[@class="jv-job-list-title"]/text()').extract():
        #     title = titl.strip()
        #     yield {"Title":title}
        # for dep in response.xpath('//span[@class="jv-job-list-category"]/text()').extract():
        #     department = dep.strip()
        #     yield{"Department":department}
        # for countr in response.xpath('//td[@class="jv-job-list-name"]/span[2]/text()').extract():
        #     country = countr.strip()
        #     yield{"Country":country}
        ti = response.xpath('//span[@class="jv-job-list-title"]/text()').extract()
        de = response.xpath('//span[@class="jv-job-list-category"]/text()').extract()
        co = response.xpath('//td[@class="jv-job-list-name"]/span[2]/text()').extract()
        yield{'titl':ti, 'Depa': de, "Cou": co}

Answer 1

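Iterate over the rows of the job table and yield one item per row, so that the values belonging to one job stay together in one item: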

CODE:

import scrapy


class TaskUs(scrapy.Spider):
    name = 'task'
    start_urls = ["https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0"]

    def parse(self, response):
        # Select every row of the job listing table.
        tables = response.xpath('//*[@class="jv-job-list jv-search-list"]/tbody/tr')
        for table in tables:
            # The leading dot makes the XPath relative to the current row.
            yield {
                'Title':table.xpath('.//*[@class="jv-job-list-name"]/span[1]/text()').get()
            }

OUTPUT:

{'Title': 'Real Time Analyst'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Senior Workforce Manager'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Senior Workforce Manager'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Senior Workforce Manager'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'VP of Workforce Analytics'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Management Positions'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Manager'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Manager'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Manager'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Manager'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Manager'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Manager'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Manager'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Planner'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Planner'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Planner'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Planner'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Planner'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Planner'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Planner'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Planner'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Planner'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Planner'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Supervisor'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Supervisor'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Supervisor'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Supervisor'}
2021-08-15 21:51:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jobs.jobvite.com/taskus-inc/search?c=Workforce%20Management&p=0>
{'Title': 'Workforce Supervisor'}
2021-08-15 21:51:14 [scrapy.core.engine] INFO: Closing spider (finished)
2021-08-15 21:51:14 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 1,
 'downloader/exception_type_count/twisted.internet.error.TCPTimedOutError': 1,
 'downloader/request_bytes': 686,
 'downloader/request_count': 2,
 'downloader/request_method_count/GET': 2,
 'downloader/response_bytes': 15293,
 'downloader/response_count': 1,
 'downloader/response_status_count/200'
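
Some background on why the CSV ends up with a single column, as far as I understand Scrapy's feed exports: unless `FEED_EXPORT_FIELDS` is set, the CSV exporter takes its header from the keys of the first item it receives. In the question every item has exactly one key and the first one is `Title`, so the `Department` and `Country` items have no column to go into. JSON has no fixed header, which is why all the data shows up there. The way out is to yield one item per job that carries all three keys. Below is a minimal sketch of that idea in plain Python, without Scrapy, so it can be run on its own. `rows_from_columns`, `to_csv` and `FIELDS` are names made up for this illustration, the sample department and country values are placeholders, and the sketch assumes the three lists line up one-to-one.

```python
import csv
import io
from itertools import zip_longest

# Column order for the CSV header (illustrative constant, not part of Scrapy).
FIELDS = ["Title", "Department", "Country"]


def rows_from_columns(titles, departments, countries):
    """Combine three parallel lists into one dict per job.

    Assumption: the lists are aligned, i.e. element i of each list
    belongs to the same job. Shorter lists are padded with "".
    """
    for title, department, country in zip_longest(
        titles, departments, countries, fillvalue=""
    ):
        yield {
            "Title": title.strip(),
            "Department": department.strip(),
            "Country": country.strip(),
        }


def to_csv(rows):
    """Write the row dicts to a CSV string with a fixed three-column header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values standing in for the three XPath result lists.
    titles = [" Real Time Analyst ", "Workforce Planner"]
    departments = ["Department A", " Department B"]
    countries = ["Country A", "Country B "]
    print(to_csv(list(rows_from_columns(titles, departments, countries))))
```

Inside `parse`, the same helper could be used as `yield from rows_from_columns(ti, de, co)` with the three lists from the question's complete code. If a job is missing one of the spans, the lists go out of step, so looping over the table rows as in the answer above and reading all three fields from each row is probably the more robust route.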

solution1
0 ACCEPTED 2021-08-15 15:54:17
