
Convert a CSV file to JSON file

I am trying to convert a CSV file into JSON files based on a column value. The CSV file looks something like this.

ID        Name          Age         
CSE001    John           18
CSE002    Marie          20
ECE001    Josh           22
ECE002    Peter          23

Currently I am using the following code to produce the JSON file.

import csv
import json
 
def csv_to_json(csv_file_path, json_file_path):
    
    data_dict = {}
 
    with open(csv_file_path, encoding = 'utf-8') as csv_file_handler:
        csv_reader = csv.DictReader(csv_file_handler)
 
        for rows in csv_reader:
            
            key = rows['ID']
            data_dict[key] = rows

    with open(json_file_path, 'w', encoding = 'utf-8') as json_file_handler:
        json_file_handler.write(json.dumps(data_dict, indent = 4))

OUTPUT:

{
    "CSE001": {
        "ID": "CSE001",
        "Name": "John",
        "Age": 18
    },
    "CSE002": {
        "ID": "CSE002",
        "Name": "Marie",
        "Age": 20
    },
    "ECE001": {
        "ID": "ECE001",
        "Name": "Josh",
        "Age": 22
    },
    "ECE002": {
        "ID": "ECE002",
        "Name": "Peter",
        "Age": 23
    }
}

I want the output split into two separate JSON files, one for CSE and one for ECE, based on the ID value. Is there a way to achieve this?

Required output:

CSE.json:

{
    "CSE001": {
        "ID": "CSE001",
        "Name": "John",
        "Age": 18
    },
    "CSE002": {
        "ID": "CSE002",
        "Name": "Marie",
        "Age": 20
    }
}

ECE.json:

{
    "ECE001": {
        "ID": "ECE001",
        "Name": "Josh",
        "Age": 22
    },
    "ECE002": {
        "ID": "ECE002",
        "Name": "Peter",
        "Age": 23
    }
}

I suggest using pandas; it makes this easier.

The code could look like this:

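import pandas as pd

def csv_to_json(csv_file_path):
    df = pd.read_csv(csv_file_path)

    # Select rows by ID prefix; startswith avoids matching 'CSE' mid-string.
    df_CSE = df[df['ID'].str.startswith('CSE')]
    df_ECE = df[df['ID'].str.startswith('ECE')]

    # Index by ID so that orient='index' nests each record under its ID.
    df_CSE.set_index('ID', drop=False).to_json('CSE.json', orient='index', indent=4)
    df_ECE.set_index('ID', drop=False).to_json('ECE.json', orient='index', indent=4)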

You can create a DataFrame and then proceed as follows:

import pandas as pd
df = pd.DataFrame.from_dict({  
  "CSE001":{ 
         "ID":"CSE001",
         "Name":"John",
         "Age":18
        },
 "CSE002":{
        "ID":"CSE002",
        "Name":"Marie",
        "Age":20
       },
"ECE001":{
       "ID":"ECE001",
       "Name":"Josh",
       "Age":22
      },
"ECE002":{
       "ID":"ECE002",
       "Name":"Peter",
       "Age":23
      }
},orient='index')

df["id_"] = df["ID"].str[:3]  # temp column storing the first three chars (CSE / ECE)
grps = df.groupby("id_")[["ID", "Name", "Age"]]
for k, v in grps:
    print(v.to_json(orient="index"))  # you can write each group to a JSON file as well
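
As the comment in the loop suggests, each group can be written to its own file instead of printed. A minimal sketch, assuming pandas 1.0 or later (for the indent argument) and a DataFrame whose ID values are unique; the helper name write_groups and its parameters are made up for illustration:

```python
from pathlib import Path

import pandas as pd


def write_groups(df, out_dir=".", prefix_len=3):
    """Write one <prefix>.json file per ID prefix and return the file paths."""
    out_dir = Path(out_dir)
    paths = []
    # Group rows by the leading characters of the ID (e.g. "CSE", "ECE").
    for prefix, group in df.groupby(df["ID"].str[:prefix_len]):
        path = out_dir / f"{prefix}.json"
        # Index by ID (keeping the ID column) so that orient="index"
        # produces {"CSE001": {"ID": "CSE001", ...}, ...}.
        group.set_index("ID", drop=False).to_json(path, orient="index", indent=4)
        paths.append(path)
    return paths
```

Calling write_groups(pd.read_csv('input.csv')) should then leave CSE.json and ECE.json in the current directory, with Age kept as a number because read_csv infers numeric columns.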

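You can store each row in a two-level dictionary, where the top level is keyed by the first 3 characters of the ID.

These groups can then be written out to separate files, with the key forming part of the filename: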

from collections import defaultdict
import csv
import json

 
def csv_to_json(csv_file_path, json_base_path):
    data_dict = defaultdict(dict)
 
    with open(csv_file_path, encoding = 'utf-8') as csv_file_handler:
        csv_reader = csv.DictReader(csv_file_handler)
 
        for row in csv_reader:
            key = row['ID'][:3]
            data_dict[key][row['ID']] = row
    
    for key, values in data_dict.items():
        with open(f'{json_base_path}_{key}.json', 'w', encoding='utf-8') as json_file_handler:
            json_file_handler.write(json.dumps(values, indent = 4))
                                                 
csv_to_json('input.csv', 'output')

A defaultdict is used to avoid having to test whether a key already exists before using it.
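
To illustrate the difference, here is a small side-by-side sketch with a few hard-coded rows (the sample rows are made up for the comparison, not read from a file). The plain dict needs an explicit existence check; defaultdict(dict) creates the inner dict on first access:

```python
from collections import defaultdict

rows = [
    {"ID": "CSE001", "Name": "John"},
    {"ID": "ECE001", "Name": "Josh"},
    {"ID": "CSE002", "Name": "Marie"},
]

# Plain dict: the inner dict has to be created explicitly the first time.
plain = {}
for row in rows:
    key = row["ID"][:3]
    if key not in plain:
        plain[key] = {}
    plain[key][row["ID"]] = row

# defaultdict(dict): a missing key gets an empty dict automatically.
grouped = defaultdict(dict)
for row in rows:
    grouped[row["ID"][:3]][row["ID"]] = row

print(sorted(grouped))   # ['CSE', 'ECE']
print(plain == grouped)  # True
```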

This will create output_CSE.json and output_ECE.json, for example:

{
    "ECE001": {
        "ID": "ECE001",
        "Name": "Josh",
        "Age": "22"
    },
    "ECE002": {
        "ID": "ECE002",
        "Name": "Peter",
        "Age": "23"
    }
}
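
Note that Age comes out as a string here, because csv.DictReader returns every field as text, whereas the required output in the question shows numbers. If numeric values are wanted, the field can be converted before the row is stored. A rough sketch, assuming Age always holds a whole number; group_rows is a hypothetical helper, and the sample data is fed from an in-memory string rather than a file:

```python
import csv
import io
import json
from collections import defaultdict


def group_rows(csv_file, prefix_len=3):
    """Group rows by ID prefix, converting Age from text to int."""
    groups = defaultdict(dict)
    for row in csv.DictReader(csv_file):
        row["Age"] = int(row["Age"])  # assumes Age is always a whole number
        groups[row["ID"][:prefix_len]][row["ID"]] = row
    return groups


sample = io.StringIO("ID,Name,Age\nCSE001,John,18\nECE001,Josh,22\n")
groups = group_rows(sample)
print(json.dumps(groups["CSE"], indent=4))
```

The same int() conversion can be dropped into the loop of csv_to_json above; a non-numeric Age would raise ValueError, so real data may need extra handling.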

