
Convert CSV to Nested JSON using / header delimiter

My CSV headers look something like

from/email to/0/email personalization/0/email personalization/0/data/first_name personalization/0/data/company_name personalization/0/data/job_title template_id

Output should be:

[
 {
   "from": {
      "email": "me@x.com",
      "name": "Me"
   },
   "to": [
      {
         "email": "mike@x.com"
      }
   ],
   "personalization": [
      {
         "email": "mike@x.com",
         "data": {
            "first_name": "Mike",
            "company_name": "X.com",
            "job_title": "Chef"
         }
      }
   ],
   "template_id": "123456"
},

I tried

csvjson input.csv output.csv
csvtojson input.csv output.csv
csv2json input.csv output.csv

python3 app.py

import csv 
import json 

def csv_to_json(csvFilePath, jsonFilePath):
    jsonArray = []
      
    #read csv file
    with open(csvFilePath, encoding='utf-8') as csvf: 
        #load csv file data using csv library's dictionary reader
        csvReader = csv.DictReader(csvf) 

        #convert each csv row into python dict
        for row in csvReader: 
            #add this python dict to json array
            jsonArray.append(row)
  
    #convert python jsonArray to JSON String and write to file
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf: 
        jsonString = json.dumps(jsonArray, indent=4)
        jsonf.write(jsonString)
          
csvFilePath = r'outputt1.csv'
jsonFilePath = r'outputt1.json'
csv_to_json(csvFilePath, jsonFilePath)

node app.js

const CSVToJSON = require('csvtojson');

// convert users.csv file to JSON array
CSVToJSON().fromFile('outputt1.csv')
    .then(from => {

        // from is a JSON array
        // log the JSON array
        console.log(from);
    }).catch(err => {
        // log error if any
        console.log(err);
    });

All of them output some variation of flat, single-level JSON with no nesting.

The only thing that worked was uploading it to https://www.convertcsv.com/csv-to-json.htm and converting each file by hand, but that is obviously not a solution.

I have seen a post recommending ChoETL.JSON for this exact purpose, but I was unable to install it on a Mac.

Your problem should be broken down into two parts: parsing CSV data for conversion into JSON, and building a JSON structure following path-like instructions.

For the first part, it is necessary to clarify the format of the CSV input, as there is no binding standard for CSV, just a basic description in RFC 4180 and many variants tailored to specific use cases or data types. For the sake of simplicity, let's assume that records are separated by newlines, that fields are separated by commas, and that fields are not enclosed in quotes, as the data itself never contains any of these delimiters. Let's further assume that exactly one record (the first) contains the headers, and that all records have the same number of fields. You may want to adjust these assumptions to match your actual CSV data.

Then, to read in the CSV data with jq, use the -R option to treat the input as newline-separated lines of raw text, and split each line on commas using the / operator:

cat input.csv
from/email,to/0/email,personalization/0/email,personalization/0/data/first_name,personalization/0/data/company_name,personalization/0/data/job_title,template_id
me@x.com,mike@x.com,mike@x.com,Mike,X.com,Chef,123456

jq -R '. / ","' input.csv
[
  "from/email",
  "to/0/email",
  "personalization/0/email",
  "personalization/0/data/first_name",
  "personalization/0/data/company_name",
  "personalization/0/data/job_title",
  "template_id"
]
[
  "me@x.com",
  "mike@x.com",
  "mike@x.com",
  "Mike",
  "X.com",
  "Chef",
  "123456"
]

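If the simplifying assumptions above do not hold for your data (for example, a field is quoted because it contains a comma), splitting each line on "," will cut that field apart. As a rough sketch of the difference, here is the same reading step in Python with the standard csv module, which understands quoted fields. The sample data and the read_rows helper are made up for illustration and are not part of the question:

```python
import csv
import io

# Hypothetical sample: the company name contains a comma, so it is quoted.
SAMPLE = 'from/email,personalization/0/data/company_name\nme@x.com,"X, Inc."\n'


def read_rows(text):
    """Parse CSV text into a list of rows, each a list of field strings."""
    # csv.reader handles quoted fields, embedded commas and doubled quotes.
    return list(csv.reader(io.StringIO(text, newline="")))


rows = read_rows(SAMPLE)

# Naive splitting on commas, comparable to `. / ","` in jq.
naive = SAMPLE.splitlines()[1].split(",")

print(rows[1])  # the quoted field stays in one piece
print(naive)    # the quoted field is split in two
```

With unquoted data like the example input, both approaches give the same fields, so the plain split is fine as long as the assumptions hold.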

As for the second part, you can use functions like setpath, which interpret arrays as paths into an object structure. Split the header names into arrays using / again, and build up the JSON objects by iterating over the fields with a reduce statement. I also assumed that numbers in the header paths always represent array indices (and never field names that merely look like numbers), so I converted them using tonumber. Header fields are aligned with data fields using transpose:

… | jq -s '
  (.[0] | map(. / "/" | map(tonumber? // .))) as $headers
  | .[1:] | map(
    reduce ([$headers, .] | transpose[]) as [$path, $value] (
      {}; setpath($path; $value)
    )
  )
'
[
  {
    "from": {
      "email": "me@x.com"
    },
    "to": [
      {
        "email": "mike@x.com"
      }
    ],
    "personalization": [
      {
        "email": "mike@x.com",
        "data": {
          "first_name": "Mike",
          "company_name": "X.com",
          "job_title": "Chef"
        }
      }
    ],
    "template_id": "123456"
  }
]

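Since the question also tried Python, the same idea can be sketched there. This is only an illustration under the same assumptions as the jq filter, not a drop-in replacement: parse_path, set_path and csv_to_nested are hypothetical helper names, and only path parts consisting purely of decimal digits are treated as list indices (jq's tonumber would also accept things like "1.5").

```python
import csv
import io
import json


def parse_path(header):
    """Split 'to/0/email' into ['to', 0, 'email']; digit-only parts become indices."""
    return [int(p) if p.isdecimal() else p for p in header.split("/")]


def set_path(root, path, value):
    """Store value at path inside root, creating dicts and lists on the way."""
    node = root
    for key, next_key in zip(path, path[1:]):
        # The container to create depends on the type of the following key.
        empty = [] if isinstance(next_key, int) else {}
        if isinstance(key, int):
            # Pad the list with None so that index `key` exists.
            node.extend([None] * (key + 1 - len(node)))
            if node[key] is None:
                node[key] = empty
            node = node[key]
        else:
            node = node.setdefault(key, empty)
    last = path[-1]
    if isinstance(last, int):
        node.extend([None] * (last + 1 - len(node)))
    node[last] = value
    return root


def csv_to_nested(text):
    """Convert CSV text with '/'-separated header paths into nested records."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    paths = [parse_path(h) for h in rows[0]]
    records = []
    for row in rows[1:]:
        record = {}
        for path, value in zip(paths, row):
            set_path(record, path, value)
        records.append(record)
    return records


SAMPLE = (
    "from/email,to/0/email,personalization/0/email,"
    "personalization/0/data/first_name,personalization/0/data/company_name,"
    "personalization/0/data/job_title,template_id\n"
    "me@x.com,mike@x.com,mike@x.com,Mike,X.com,Chef,123456\n"
)

result = csv_to_nested(SAMPLE)
print(json.dumps(result, indent=2))
```

As with the jq version, all values stay strings, and the sketch assumes the header paths are consistent (a path part is never used as both a list index and an object key).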

You might want to try out Miller. It is available as a static binary, so you just need to put the mlr executable somewhere (preferably in your PATH) and you're done with the installation.

mlr --icsv --ojson --jflatsep / cat file.csv
