简体   繁体   中英

Convert CSV to JSON file in python

CSV文件 Above csv file which contains nearly 2000 rows.

I want to parse CSV file line by line and convert it to JSON and send over websocket.

I found some code online which converts CSV to JSON as follows:

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')

But the problem with above code is that we need to mention the field names to parse the CSV. Since I have more than 2000 rows, it is not a feasible solution.

Can anyone suggest how to parse CSV file line by line and convert it into JSON without specifying fieldnames?

Python CSV to JSON

To convert CSV to JSON in Python, follow these steps:

  1. Initialize a Python List.
  2. Read the lines of CSV file using csv.DictReader() function.
  3. Convert each line into a dictionary. Add the dictionary to the Python List created in step 1.
  4. Convert the Python List to JSON String using json.dumps() .
  5. You may write the JSON String to a JSON file.

data.csv

  • For the test I made 100.000 lines in a csv file with copy/paste, and the whole conversion takes about half a second with Apple's M1 Chip while the presented example took only 0.0005 seconds.

在此处输入图片说明

column_1,column_2,column_3
value_1_1,value_1_2,value_1_3
value_2_1,value_2_2,value_2_3
value_3_1,value_3_2,value_3_3

Python Program

import csv 
import json
import time

def csv_to_json(csvFilePath, jsonFilePath):
    jsonArray = []
      
    #read csv file
    with open(csvFilePath, encoding='utf-8') as csvf: 
        #load csv file data using csv library's dictionary reader
        csvReader = csv.DictReader(csvf) 

        #convert each csv row into python dict
        for row in csvReader: 
            #add this python dict to json array
            jsonArray.append(row)
  
    #convert python jsonArray to JSON String and write to file
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf: 
        jsonString = json.dumps(jsonArray, indent=4)
        jsonf.write(jsonString)
          
csvFilePath = r'data.csv'
jsonFilePath = r'data.json'

start = time.perf_counter()
csv_to_json(csvFilePath, jsonFilePath)
finish = time.perf_counter()

print(f"Conversion 100.000 rows completed successfully in {finish - start:0.4f} seconds")

output: data.json

Conversion 100.000 rows completed successfully in 0.5169 seconds
[
    {
        "column_1": "value_1_1",
        "column_2": "value_1_2",
        "column_3": "value_1_3"
    },
    {
        "column_1": "value_2_1",
        "column_2": "value_2_2",
        "column_3": "value_2_3"
    },
    {
        "column_1": "value_3_1",
        "column_2": "value_3_2",
        "column_3": "value_3_3"
    }
]

If you are happy with the solution you have, and the only thing that bothers you is how to enter the 'long' list of column headers, i suggest you read the first (header) line of the CSV using something like reader.next(),

import csv

with open('your_CSV.csv') as csvFile:
    reader = csv.reader(csvFile)
    field_names_list = reader.next()

and then split the string obtained to a list using str.split(',') .

The list you get can be then be fed to

fieldnames = (---from the above code block ---)

line of your code.

Assuming your CSV has a header row : just remove the fieldnames argument from DictReader

If the fieldnames parameter is omitted, the values in the first row of the file f will be used as the fieldnames. in https://docs.python.org/2/library/csv.html

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')


reader = csv.DictReader(csvfile)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')

You could try this:

import csv 
import json 

def csv_to_json(csvFilePath, jsonFilePath):
    jsonArray = []
      
    with open(csvFilePath, encoding='utf-8') as csvf: 
        csvReader = csv.DictReader(csvf) 

        for row in csvReader: 
            jsonArray.append(row)
  
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf: 
        jsonString = json.dumps(jsonArray, indent=4)
        jsonf.write(jsonString)
          
csvFilePath = r'data.csv'
jsonFilePath = r'data.json'
csv_to_json(csvFilePath, jsonFilePath)

I converted a 200MB file with 600K+ rows and it worked very well.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM