Convert CSV to Nested JSON using / header delimiter

My CSV headers look something like:

from/email to/0/email personalization/0/email personalization/0/data/first_name personalization/0/data/company_name personalization/0/data/job_title template_id

Output should be:

[
 {
   "from": {
      "email": "me@x.com",
      "name": "Me"
   },
   "to": [
      {
         "email": "mike@x.com"
      }
   ],
   "personalization": [
      {
         "email": "mike@x.com",
         "data": {
            "first_name": "Mike",
            "company_name": "X.com",
            "job_title": "Chef"
         }
      }
   ],
   "template_id": "123456"
},

I tried:

csvjson input.csv output.csv
csvtojson input.csv output.csv
csv2json input.csv output.csv

python3 app.py

import csv 
import json 

def csv_to_json(csvFilePath, jsonFilePath):
    jsonArray = []
      
    #read csv file
    with open(csvFilePath, encoding='utf-8') as csvf: 
        #load csv file data using csv library's dictionary reader
        csvReader = csv.DictReader(csvf) 

        #convert each csv row into python dict
        for row in csvReader: 
            #add this python dict to json array
            jsonArray.append(row)
  
    #convert python jsonArray to JSON String and write to file
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf: 
        jsonString = json.dumps(jsonArray, indent=4)
        jsonf.write(jsonString)
          
csvFilePath = r'outputt1.csv'
jsonFilePath = r'outputt1.json'
csv_to_json(csvFilePath, jsonFilePath)

node app.js

const CSVToJSON = require('csvtojson');

// convert outputt1.csv file to JSON array
CSVToJSON().fromFile('outputt1.csv')
    .then(from => {

        // from is a JSON array
        // log the JSON array
        console.log(from);
    }).catch(err => {
        // log error if any
        console.log(err);
    });

All of them output some variation of single-line JSON with no nesting.

The only thing that worked was uploading it to https://www.convertcsv.com/csv-to-json.htm and converting each file by hand, but that is obviously not a solution.

I have seen a post recommending Choetl.Json for this exact purpose, but I was unable to install it on a Mac.

Your problem should be broken down into two parts: parsing CSV data for conversion into JSON, and building a JSON structure following path-like instructions.

For the first part, it is necessary to clarify the formatting of the CSV input, as there is no general standard for CSV, just a fundamental description in the RFC 4180 proposal and a lot of adaptations tailored to specific use cases or data types. For the sake of simplicity, let's assume that records are separated by newlines, fields are separated by commas, and there are no field delimiters (such as enclosing quotes), as the data itself never contains any of these separators. Let's further assume that there is exactly one record (the first) representing the headers, and that all records have exactly the same number of fields. You may want to adjust these assumptions to your actual CSV data.
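Before relying on these assumptions, it may be worth checking that a given file actually satisfies them. The following is only a minimal Python sketch of such a check; the function name `check_csv_assumptions` is made up for illustration and is not part of any tool mentioned here.

```python
def check_csv_assumptions(text):
    """Raise ValueError if the text breaks the simplifying assumptions.

    Returns the number of fields per record when everything looks fine.
    """
    lines = text.splitlines()
    if not lines:
        raise ValueError("no header record found")
    # Quoted fields would mean commas or newlines may hide inside the data.
    if '"' in text:
        raise ValueError("quoted fields found; a full CSV parser is needed")
    expected = len(lines[0].split(","))
    for number, line in enumerate(lines[1:], start=2):
        found = len(line.split(","))
        if found != expected:
            raise ValueError(
                f"record {number} has {found} fields, expected {expected}"
            )
    return expected


sample = "from/email,template_id\nme@x.com,123456\n"
print(check_csv_assumptions(sample))
```

If the check fails, a real CSV parser (for example Python's `csv` module) is probably the safer choice for the first part.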

Then, to read in the CSV data, use jq's -R option to treat the input as newline-separated lines of raw text, and split the lines using the / operator:

cat input.csv
from/email,to/0/email,personalization/0/email,personalization/0/data/first_name,personalization/0/data/company_name,personalization/0/data/job_title,template_id
me@x.com,mike@x.com,mike@x.com,Mike,X.com,Chef,123456
jq -R '. / ","' input.csv
[
  "from/email",
  "to/0/email",
  "personalization/0/email",
  "personalization/0/data/first_name",
  "personalization/0/data/company_name",
  "personalization/0/data/job_title",
  "template_id"
]
[
  "me@x.com",
  "mike@x.com",
  "mike@x.com",
  "Mike",
  "X.com",
  "Chef",
  "123456"
]
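For comparison with the Python attempt in the question, a rough counterpart of this splitting step could look like the sketch below. It relies on the same no-quoting assumption; `split_records` is a hypothetical helper name, and skipping blank lines is a choice made here rather than something the jq command does.

```python
import io


def split_records(lines):
    """Yield each non-blank line as a list of comma-separated fields."""
    for line in lines:
        line = line.rstrip("\r\n")  # drop the line terminator only
        if line:  # blank lines are skipped (a choice made in this sketch)
            yield line.split(",")


# Any iterable of lines works, including an open file object.
sample = io.StringIO("from/email,template_id\nme@x.com,123456\n")
for record in split_records(sample):
    print(record)
```

With a real file, `split_records(open("input.csv", encoding="utf-8"))` would be used in place of the in-memory sample.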


As for the second part, you can use functions like setpath, which interpret arrays as paths into an object structure: split the header names into arrays using / again, then build up each JSON object by iterating over the fields with a reduce statement. I also assumed that numbers in the header paths always represent array indices (and never field names that merely look like numbers). I converted them using tonumber, and aligned header fields with data fields using transpose:

… | jq -s '
  (.[0] | map(. / "/" | map(tonumber? // .))) as $headers
  | .[1:] | map(
    reduce ([$headers, .] | transpose[]) as [$path, $value] (
      {}; setpath($path; $value)
    )
  )
'
[
  {
    "from": {
      "email": "me@x.com"
    },
    "to": [
      {
        "email": "mike@x.com"
      }
    ],
    "personalization": [
      {
        "email": "mike@x.com",
        "data": {
          "first_name": "Mike",
          "company_name": "X.com",
          "job_title": "Chef"
        }
      }
    ],
    "template_id": "123456"
  }
]
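Since the question also tried Python, here is a sketch of the same idea without jq. The helpers `parse_path`, `set_path` and `rows_to_nested` are names invented for this example. It assumes, as above, that purely numeric path segments are array indices and that the first segment of every header is a name. Unlike tonumber, it only converts non-negative integers.

```python
import json


def parse_path(header):
    """Split 'to/0/email' into ['to', 0, 'email']."""
    return [int(p) if p.isdecimal() else p for p in header.split("/")]


def set_path(root, path, value):
    """Store value at path inside root, creating dicts and lists on the way."""
    node = root
    for i, key in enumerate(path):
        if isinstance(key, int):
            # Pad the list with None so that index `key` exists.
            node.extend([None] * (key + 1 - len(node)))
        if i == len(path) - 1:
            node[key] = value
            break
        # The next segment decides whether the child is a list or a dict.
        new_child = [] if isinstance(path[i + 1], int) else {}
        if isinstance(key, int):
            if node[key] is None:
                node[key] = new_child
        else:
            node.setdefault(key, new_child)
        node = node[key]
    return root


def rows_to_nested(headers, rows):
    """Build one nested object per data row from path-like headers."""
    paths = [parse_path(h) for h in headers]
    result = []
    for row in rows:
        record = {}  # assumption: the top level is always an object
        for path, value in zip(paths, row):
            set_path(record, path, value)
        result.append(record)
    return result


headers = ("from/email,to/0/email,personalization/0/email,"
           "personalization/0/data/first_name,template_id").split(",")
rows = ["me@x.com,mike@x.com,mike@x.com,Mike,123456".split(",")]
print(json.dumps(rows_to_nested(headers, rows), indent=2))
```

Here `zip` plays the role of transpose, and the inner loop over `(path, value)` pairs corresponds to the reduce statement. All values stay strings, as in the jq output.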


You might want to try out Miller. It is available as a static binary, so you just need to put the mlr executable somewhere (preferably in your PATH) and you're done with the installation.

mlr --icsv --ojson --jflatsep / cat file.csv
