How can I convert JSON to CSV?

I have a JSON file I want to convert to a CSV file. How can I do this with Python?

I tried:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    f.writerow(item)

f.close()

However, it did not work. I am using Django and the error I received is:

`'file' object has no attribute 'writerow'`

I then tried the following:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    csv_file.writerow(item)  # ← changed

f.close()

I then get the error:

`sequence expected`

Sample JSON file:

[{
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    }, {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    }, {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }, {
        "pk": 4,
        "model": "auth.permission",
        "fields": {
            "codename": "add_group",
            "name": "Can add group",
            "content_type": 2
        }
    }, {
        "pk": 10,
        "model": "auth.permission",
        "fields": {
            "codename": "add_message",
            "name": "Can add message",
            "content_type": 4
        }
    }
]

With the pandas library, this is as easy as using two commands:

df = pd.read_json()

read_json converts a JSON string to a pandas object (either a series or dataframe). Then:

df.to_csv()

Which can either return a string or write directly to a csv-file. See the docs for to_csv.

Based on the verbosity of previous answers, we should all thank pandas for the shortcut.

For unstructured JSON, see this answer.

EDIT: Someone asked for a working minimal example:

import pandas as pd

with open('jsonfile.json', encoding='utf-8') as inputfile:
    df = pd.read_json(inputfile)

df.to_csv('csvfile.csv', encoding='utf-8', index=False)

First, your JSON has nested objects, so it normally cannot be directly converted to CSV. You need to change that to something like this:

[{
    "pk": 22,
    "model": "auth.permission",
    "codename": "add_logentry",
    "content_type": 8,
    "name": "Can add log entry"
},
......]

Here is my code to generate CSV from that:

import csv
import json

x = """[
    {
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    },
    {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    },
    {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }
]"""

rows = json.loads(x)

# newline="" prevents blank lines between rows on Windows (Python 3)
with open("test.csv", "w", newline="") as out_file:
    writer = csv.writer(out_file)

    # Write the CSV header; remove this line if you don't need it
    writer.writerow(["pk", "model", "codename", "name", "content_type"])

    for row in rows:
        writer.writerow([row["pk"],
                         row["model"],
                         row["fields"]["codename"],
                         row["fields"]["name"],
                         row["fields"]["content_type"]])

You will get output as:

pk,model,codename,name,content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8

I am assuming that your JSON file will decode into a list of dictionaries. First we need a function which will flatten the JSON objects:

def flattenjson(b, delim):
    val = {}
    for i in b.keys():
        if isinstance(b[i], dict):
            get = flattenjson(b[i], delim)
            for j in get.keys():
                val[i + delim + j] = get[j]
        else:
            val[i] = b[i]
            
    return val

The result of running this snippet on your JSON object:

flattenjson({
    "pk": 22, 
    "model": "auth.permission", 
    "fields": {
      "codename": "add_message", 
      "name": "Can add message", 
      "content_type": 8
    }
  }, "__")

is

{
    "pk": 22, 
    "model": "auth.permission", 
    "fields__codename": "add_message", 
    "fields__name": "Can add message", 
    "fields__content_type": 8
}

After applying this function to each dict in the input array of JSON objects:

input = list(map(lambda x: flattenjson(x, "__"), input))  # list() so the rows can be iterated more than once on Python 3

and finding the relevant column names:

columns = [x for row in input for x in row.keys()]
columns = list(set(columns))

it's not hard to run this through the csv module:

with open(fname, 'w', newline='') as out_file:  # Python 3; use 'wb' on Python 2
    csv_w = csv.writer(out_file)
    csv_w.writerow(columns)

    for i_r in input:
        csv_w.writerow([i_r.get(x, "") for x in columns])
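The steps above (flatten each object, collect the column names, write the rows) can be sketched end to end as follows. This is only an illustration for Python 3: the sample `data` is made up, the output goes to an in-memory buffer instead of a file, and the columns are sorted so that their order is stable, which is an assumption not present in the answer.

```python
import csv
import io


def flattenjson(b, delim):
    """Same idea as the function above: join nested keys with `delim`."""
    val = {}
    for key, value in b.items():
        if isinstance(value, dict):
            for sub_key, sub_value in flattenjson(value, delim).items():
                val[key + delim + sub_key] = sub_value
        else:
            val[key] = value
    return val


# Hypothetical sample data; the second object is missing "content_type" on purpose.
data = [
    {"pk": 22, "model": "auth.permission",
     "fields": {"codename": "add_logentry", "content_type": 8}},
    {"pk": 23, "model": "auth.permission",
     "fields": {"codename": "change_logentry"}},
]

rows = [flattenjson(row, "__") for row in data]

# sorted() gives a stable column order; a plain set would not.
columns = sorted({key for row in rows for key in row})

out = io.StringIO()
csv_w = csv.writer(out, lineterminator="\n")
csv_w.writerow(columns)
for row in rows:
    # Missing keys become empty cells.
    csv_w.writerow([row.get(col, "") for col in columns])

print(out.getvalue())
```

Replacing `io.StringIO()` with `open(fname, 'w', newline='')` would write the same content to a file.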

I hope this helps.

JSON can represent a wide variety of data structures -- a JS "object" is roughly like a Python dict (with string keys), a JS "array" roughly like a Python list, and you can nest them as long as the final "leaf" elements are numbers or strings.

CSV can essentially represent only a 2-D table -- optionally with a first row of "headers", i.e., "column names", which can make the table interpretable as a list of dicts, instead of the normal interpretation, a list of lists (again, "leaf" elements can be numbers or strings).

So, in the general case, you can't translate an arbitrary JSON structure to a CSV. In a few special cases you can (array of arrays with no further nesting; arrays of objects which all have exactly the same keys). Which special case, if any, applies to your problem? The details of the solution depend on which special case you do have. Given the astonishing fact that you don't even mention which one applies, I suspect you may not have considered the constraint, neither usable case in fact applies, and your problem is impossible to solve. But please do clarify!
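The two convertible special cases named above can be sketched as follows. This is a minimal illustration with made-up data (the names `rows_as_lists` and `rows_as_dicts` are hypothetical), writing to in-memory buffers rather than files.

```python
import csv
import io
import json

# Case 1: an array of arrays with no further nesting.
# Each inner array becomes one CSV row.
rows_as_lists = json.loads(
    '[["pk", "model"], [22, "auth.permission"], [23, "auth.permission"]]'
)
buf1 = io.StringIO()
csv.writer(buf1, lineterminator="\n").writerows(rows_as_lists)

# Case 2: an array of objects which all have exactly the same keys.
# The keys become the header row, each object becomes one data row.
rows_as_dicts = json.loads(
    '[{"pk": 22, "model": "auth.permission"},'
    ' {"pk": 23, "model": "auth.permission"}]'
)
buf2 = io.StringIO()
writer = csv.DictWriter(buf2, fieldnames=list(rows_as_dicts[0]), lineterminator="\n")
writer.writeheader()
writer.writerows(rows_as_dicts)

print(buf1.getvalue())
print(buf2.getvalue())
```

Both buffers end up with the same table here; anything with deeper nesting has to be flattened or split first.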

A generic solution which translates any JSON list of flat objects to CSV.

Pass the input.json file as the first argument on the command line.

import csv, json, sys

input = open(sys.argv[1])
data = json.load(input)
input.close()

output = csv.writer(sys.stdout)

output.writerow(data[0].keys())  # header row

for row in data:
    output.writerow(row.values())

Use json_normalize from pandas:

  • Using the sample data from the OP in a file named test.json.
  • encoding='utf-8' has been used here, but may not be necessary for other cases.
  • The following code takes advantage of the pathlib library.
    • .open is a method of pathlib.Path.
    • Works with non-Windows paths too.
  • Use pandas.DataFrame.to_csv(...) to save the data to a csv file.

import pandas as pd
# As of pandas 1.0, using json_normalize as pandas.io.json.json_normalize is deprecated; it is now exposed in the top-level namespace.
# from pandas.io.json import json_normalize
from pathlib import Path
import json

# set path to file
p = Path(r'c:\some_path_to_file\test.json')

# read json
with p.open('r', encoding='utf-8') as f:
    data = json.loads(f.read())

# create dataframe
df = pd.json_normalize(data)

# dataframe view
#  pk            model  fields.codename           fields.name  fields.content_type
#  22  auth.permission     add_logentry     Can add log entry                    8
#  23  auth.permission  change_logentry  Can change log entry                    8
#  24  auth.permission  delete_logentry  Can delete log entry                    8
#   4  auth.permission        add_group         Can add group                    2
#  10  auth.permission      add_message       Can add message                    4

# save to csv
df.to_csv('test.csv', index=False, encoding='utf-8')

CSV Output:

pk,model,fields.codename,fields.name,fields.content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8
4,auth.permission,add_group,Can add group,2
10,auth.permission,add_message,Can add message,4

Resources for more heavily nested JSON objects:

This code should work for you, assuming that your JSON data is in a file called data.json.

import json
import csv

with open("data.json") as file:
    data = json.load(file)

with open("data.csv", "w") as file:
    csv_file = csv.writer(file)
    for item in data:
        fields = list(item['fields'].values())
        csv_file.writerow([item['pk'], item['model']] + fields)

It'll be easy to use csv.DictWriter(); a detailed implementation can look like this:

import csv
import json

def read_json(filename):
    # the with-block closes the file once the JSON has been parsed
    with open(filename) as f:
        return json.load(f)

def write_csv(data, filename):
    with open(filename, 'w', newline='') as outf:
        writer = csv.DictWriter(outf, data[0].keys())
        writer.writeheader()
        for row in data:
            writer.writerow(row)

# usage
write_csv(read_json('test.json'), 'output.csv')

Note that this assumes that all of your JSON objects have the same fields.

Here is the reference which may help you.
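When the objects do not all share the same fields, one possible workaround is to build the header from the union of all keys and let csv.DictWriter fill the gaps. The sketch below uses made-up data and an in-memory buffer; `restval` is the standard DictWriter parameter for missing values.

```python
import csv
import io

# Hypothetical data: the two objects deliberately have different keys.
data = [
    {"pk": 22, "model": "auth.permission"},
    {"pk": 23, "codename": "change_logentry"},
]

# Collect every key in first-seen order (dicts keep insertion order on Python 3.7+).
fieldnames = list(dict.fromkeys(key for row in data for key in row))

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames, restval="", lineterminator="\n")
writer.writeheader()
writer.writerows(data)  # keys missing from a row are written as restval

print(out.getvalue())
```

Without the union of keys, DictWriter raises a ValueError as soon as a row contains a key that is not in `fieldnames`.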

I was having trouble with Dan's proposed solution, but this worked for me:

import json
import csv 

f = open('test.json')
data = json.load(f)
f.close()

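# Python 3: text mode with newline='' (use 'wb+' on Python 2)
with open('test.csv', 'w', newline='') as out_file:
    writer = csv.writer(out_file)
    for item in data:
        # list() is needed on Python 3, where dict.values() is a view, not a list
        writer.writerow([item['pk'], item['model']] + list(item['fields'].values()))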

Where "test.json" contained the following:

[ 
{"pk": 22, "model": "auth.permission", "fields": 
  {"codename": "add_logentry", "name": "Can add log entry", "content_type": 8 } }, 
{"pk": 23, "model": "auth.permission", "fields": 
  {"codename": "change_logentry", "name": "Can change log entry", "content_type": 8 } }, {"pk": 24, "model": "auth.permission", "fields": 
  {"codename": "delete_logentry", "name": "Can delete log entry", "content_type": 8 } }
]

Alec's answer is great, but it doesn't work in the case where there are multiple levels of nesting. Here's a modified version that supports multiple levels of nesting. It also makes the header names a bit nicer if the nested object already specifies its own key (e.g. Firebase Analytics / BigTable / BigQuery data):

"""Converts JSON with nested fields into a flattened CSV file.
"""

import sys
import json
import csv
import os

import jsonlines

from orderedset import OrderedSet

# from https://stackoverflow.com/a/28246154/473201
def flattenjson( b, prefix='', delim='/', val=None ):
  if val is None:
    val = {}

  if isinstance( b, dict ):
    for j in b.keys():
      flattenjson(b[j], prefix + delim + j, delim, val)
  elif isinstance( b, list ):
    get = b
    for j in range(len(get)):
      key = str(j)

      # If the nested data contains its own key, use that as the header instead.
      if isinstance( get[j], dict ):
        if 'key' in get[j]:
          key = get[j]['key']

      flattenjson(get[j], prefix + delim + key, delim, val)
  else:
    val[prefix] = b

  return val

def main(argv):
  if len(argv) < 2:
    raise ValueError('Please specify a JSON file to parse')

  print("Loading and Flattening...")
  filename = argv[1]
  allRows = []
  fieldnames = OrderedSet()
  with jsonlines.open(filename) as reader:
    for obj in reader:
      # print 'orig:\n'
      # print obj
      flattened = flattenjson(obj)
      #print 'keys: %s' % flattened.keys()
      # print 'flattened:\n'
      # print flattened
      fieldnames.update(flattened.keys())
      allRows.append(flattened)

  print("Exporting to CSV...")
  outfilename = filename + '.csv'
  count = 0
  with open(outfilename, 'w', newline='') as file:
    csvwriter = csv.DictWriter(file, fieldnames=fieldnames)
    csvwriter.writeheader()
    for obj in allRows:
      # print 'allRows:\n'
      # print obj
      csvwriter.writerow(obj)
      count += 1

  print("Wrote %d rows" % count)



if __name__ == '__main__':
  main(sys.argv)

As mentioned in the previous answers, the difficulty in converting JSON to CSV is that a JSON file can contain nested dictionaries and therefore be a multidimensional data structure, versus a CSV, which is a 2D data structure. However, a good way to turn a multidimensional structure into a CSV is to have multiple CSVs that tie together with primary keys.

In your example, the first CSV output has "pk", "model" and "fields" as its columns. Values for "pk" and "model" are easy to get, but because the "fields" column contains a dictionary, it should be its own CSV, and because "codename" appears to be the primary key, you can use it as the value for "fields" to complete the first CSV. The second CSV contains the dictionary from the "fields" column, with codename as the primary key that can be used to tie the two CSVs together.

Here is a solution for your JSON file which converts nested dictionaries to 2 CSVs.

import csv
import json

def readAndWrite(inputFileName, primaryKey=""):
    input = open(inputFileName+".json")
    data = json.load(input)
    input.close()

    header = set()

    if primaryKey != "":
        outputFileName = inputFileName+"-"+primaryKey
        if inputFileName == "data":
            for i in data:
                for j in i["fields"].keys():
                    if j not in header:
                        header.add(j)
    else:
        outputFileName = inputFileName
        for i in data:
            for j in i.keys():
                if j not in header:
                    header.add(j)

    with open(outputFileName+".csv", 'w', newline='') as output_file:  # Python 3 text mode
        fieldnames = list(header)
        writer = csv.DictWriter(output_file, fieldnames, delimiter=',', quotechar='"')
        writer.writeheader()
        for x in data:
            row_value = {}
            if primaryKey == "":
                for y in x.keys():
                    yValue = x.get(y)
                    if isinstance(yValue, dict):
                        if inputFileName == "data":
                            # store only the nested dict's primary key here,
                            # and write the nested dicts to the second csv
                            row_value[y] = yValue["codename"]
                            readAndWrite(inputFileName, primaryKey="codename")
                    else:
                        # the csv module converts non-string values with str()
                        row_value[y] = yValue
                writer.writerow(row_value)
            elif primaryKey == "codename":
                for y in x["fields"].keys():
                    yValue = x["fields"].get(y)
                    if not isinstance(yValue, dict):
                        row_value[y] = yValue
                writer.writerow(row_value)

readAndWrite("data")
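The two-table idea described above can also be sketched more compactly. The version below is an illustration only: it hard-codes the column names from the question's sample data, assumes "codename" is unique, and writes to in-memory buffers (`main_out` and `fields_out` are hypothetical names) instead of files.

```python
import csv
import io

data = [
    {"pk": 22, "model": "auth.permission",
     "fields": {"codename": "add_logentry", "name": "Can add log entry", "content_type": 8}},
    {"pk": 23, "model": "auth.permission",
     "fields": {"codename": "change_logentry", "name": "Can change log entry", "content_type": 8}},
]

main_out, fields_out = io.StringIO(), io.StringIO()
main_writer = csv.DictWriter(
    main_out, fieldnames=["pk", "model", "fields"], lineterminator="\n")
fields_writer = csv.DictWriter(
    fields_out, fieldnames=["codename", "name", "content_type"], lineterminator="\n")
main_writer.writeheader()
fields_writer.writeheader()

for record in data:
    nested = record["fields"]
    # The main table stores only the primary key of the nested record ...
    main_writer.writerow(
        {"pk": record["pk"], "model": record["model"], "fields": nested["codename"]})
    # ... and the nested record itself goes to the second table.
    fields_writer.writerow(nested)

print(main_out.getvalue())
print(fields_out.getvalue())
```

Joining the two tables on the "fields" column of the first and the "codename" column of the second recovers the original records.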

I know it has been a long time since this question was asked, but I thought I might add to everyone else's answers and share a blog post that I think explains the solution in a very concise way.

Here is the link.

Open a file for writing:

employ_data = open('/tmp/EmployData.csv', 'w')

Create the csv writer object:

csvwriter = csv.writer(employ_data)
count = 0
for emp in emp_data:
    if count == 0:
        # write the header row once, from the keys of the first record
        header = emp.keys()
        csvwriter.writerow(header)
        count += 1
    csvwriter.writerow(emp.values())

Make sure to close the file in order to save the contents:

employ_data.close()

It is not a very smart way to do it, but I have had the same problem and this worked for me:

import csv
import json

f = open('data.json')
data = json.load(f)
f.close()

new_data = []

for i in data:
   flat = {}
   names = i.keys()
   for n in names:
      if isinstance(i[n], dict):
         # nested object: prefix each inner key with the outer key
         for ii in i[n].keys():
            flat[n+"_"+ii] = i[n][ii]
      else:
         flat[n] = i[n]
   new_data.append(flat)  

f = open(filename, "w", newline="")  # filename: path of the output CSV; must be opened for writing
writer = csv.DictWriter(f, new_data[0].keys())
writer.writeheader()
for row in new_data:
   writer.writerow(row)
f.close()

Surprisingly, I found that none of the answers posted here so far correctly deal with all possible scenarios (e.g., nested dicts, nested lists, None values, etc.).

This solution should work across all scenarios:

def flatten_json(json):
    def process_value(keys, value, flattened):
        if isinstance(value, dict):
            for key in value.keys():
                process_value(keys + [key], value[key], flattened)
        elif isinstance(value, list):
            for idx, v in enumerate(value):
                process_value(keys + [str(idx)], v, flattened)
        else:
            flattened['__'.join(keys)] = value

    flattened = {}
    for key in json.keys():
        process_value([key], json[key], flattened)
    return flattened
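To make the behaviour concrete, here is a small usage sketch. The function is restated in compact form so the example runs on its own, and the `record` value is made up for illustration.

```python
def flatten_json(obj):
    """Compact restatement of the function above, repeated so the example is self-contained."""
    flattened = {}

    def process_value(keys, value):
        if isinstance(value, dict):
            for key, sub_value in value.items():
                process_value(keys + [key], sub_value)
        elif isinstance(value, list):
            # list elements are keyed by their index
            for idx, item in enumerate(value):
                process_value(keys + [str(idx)], item)
        else:
            flattened['__'.join(keys)] = value

    for key, value in obj.items():
        process_value([key], value)
    return flattened


# Hypothetical record with a nested dict, nested lists and a None value.
record = {
    "pk": 22,
    "tags": ["a", "b"],
    "fields": {"name": None, "perms": [{"id": 1}]},
}

flat = flatten_json(record)
print(flat)
```

Note that an empty list or an empty dict produces no column at all, because there is no leaf value to record.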

This works relatively well. It flattens the JSON to write it to a CSV file. Nested elements are managed :)

That's for Python 3.

import json

o = json.loads('your json string') # Be careful, o must be a list, each of its objects will make a line of the csv.

def flatten(o, k='/'):
    global l, c_line
    if isinstance(o, dict):
        for key, value in o.items():
            flatten(value, k + '/' + key)
    elif isinstance(o, list):
        for ov in o:
            flatten(ov, '')
    elif isinstance(o, str):
        o = o.replace('\r',' ').replace('\n',' ').replace(';', ',')
        if not k in l:
            l[k]={}
        l[k][c_line]=o

def render_csv(l):
    # header row: one column per flattened key
    print(''.join('%s;' % k for k in l))
    # one line per input object; c_line holds the number of objects processed
    for i in range(c_line):
        # keys that are missing for this object become empty cells
        print(''.join('%s;' % l[k].get(i, '') for k in l))

def json_to_csv(object_list):
    global l, c_line
    l = {}
    c_line = 0
    for ov in object_list : # Assumes json is a list of objects
        flatten(ov)
        c_line += 1
    render_csv(l)

json_to_csv(o)

Enjoy.

My simple way to solve this:

Create a new Python file like: json_to_csv.py

Add this code:

import csv, json, sys

# check that both the input file and the output file were passed
if len(sys.argv) >= 3:

    fileInput = sys.argv[1]
    fileOutput = sys.argv[2]

    # if you are not using utf-8 files, change the encoding below
    with open(fileInput, encoding="utf-8") as inputFile:
        data = json.load(inputFile)

    with open(fileOutput, 'w', encoding="utf-8", newline='') as outputFile:
        output = csv.writer(outputFile)

        output.writerow(data[0].keys())  # header row

        for row in data:
            output.writerow(row.values())

After adding this code, save the file and run it in the terminal:

python json_to_csv.py input.txt output.csv

I hope this helps you.

See ya!

Try this:

import csv, json, sys

input = open(sys.argv[1])
data = json.load(input)
input.close()

output = csv.writer(sys.stdout)

output.writerow(data[0].keys())  # header row

for item in data:
    output.writerow(item.values())

This code works for any JSON file that contains a list of flat objects:

# -*- coding: utf-8 -*-
"""
Created on Mon Jun 17 20:35:35 2019
author: Ram
"""

import json
import csv

with open("file1.json") as file:
    data = json.load(file)



# create the csv writer object
pt_data1 = open('pt_data1.csv', 'w', newline='')
csvwriter = csv.writer(pt_data1)

count = 0

for pt in data:
    if count == 0:
        # write the header row once, from the keys of the first object
        header = pt.keys()
        csvwriter.writerow(header)
        count += 1
    csvwriter.writerow(pt.values())

pt_data1.close()

This is a modification of @MikeRepass's answer. This version writes the CSV to a file, and works for both Python 2 and Python 3.

import csv,json
input_file="data.json"
output_file="data.csv"
with open(input_file) as f:
    content=json.load(f)
try:
    context=open(output_file,'w',newline='') # Python 3
except TypeError:
    context=open(output_file,'wb') # Python 2
with context as file:
    writer=csv.writer(file)
    writer.writerow(content[0].keys()) # header row
    for row in content:
        writer.writerow(row.values())

Modified Alec McGail's answer to support JSON with lists inside:

    def flattenjson(self, mp, delim="|"):
        ret = []
        if isinstance(mp, dict):
            for k in mp.keys():
                csvs = self.flattenjson(mp[k], delim)
                for item in csvs:
                    # str() so that non-string leaves (numbers, None) can be joined
                    ret.append(k + delim + str(item))
        elif isinstance(mp, list):
            for k in mp:
                ret.extend(self.flattenjson(k, delim))
        else:
            ret.append(mp)

        return ret

Thanks

import json, csv

data = None
write_header = True
item_keys = []
try:
    with open('kk.json') as json_file:
        data = json.load(json_file)
except Exception as e:
    print(e)

# 'at' appends to bar.csv if it already exists
with open('bar.csv', 'at', newline='') as csv_file:
    writer = csv.writer(csv_file)  # , quoting=csv.QUOTE_MINIMAL)
    for item in data:
        item_values = []
        for key in item:
            if write_header:
                item_keys.append(key)
            item_values.append(item.get(key, ''))
        if write_header:
            # header row, taken from the keys of the first object
            writer.writerow(item_keys)
            write_header = False
        writer.writerow(item_values)

Consider the example below for converting a JSON file to a CSV file.

{
 "item_data" : [
      {
        "item": "10023456",
        "class": "100",
        "subclass": "123"
      }
      ]
}

The code below will convert the JSON file (data3.json) to a CSV file (data3.csv).

import json
import csv

with open("/Users/Desktop/json/data3.json") as file:
    data = json.load(file)
    print(data)

fname = "/Users/Desktop/json/data3.csv"

with open(fname, "w", newline='') as file:
    csv_file = csv.writer(file)
    csv_file.writerow(['item',
                       'class',
                       'subclass'])
    # each element of data["item_data"] is already one flat record
    for item in data["item_data"]:
        csv_file.writerow([item.get('item'),
                           item.get('class'),
                           item.get('subclass')])

The above code was executed in a locally installed PyCharm and successfully converted the JSON file to the CSV file. I hope this helps you convert your files.

Since the data appears to be in a dictionary format, it would appear that you should actually use csv.DictWriter() to output the lines with the appropriate header information. This should make the conversion somewhat easier to handle. The fieldnames parameter then sets up the column order properly, while writing the first line as headers allows the file to be read and processed later by csv.DictReader().

For example, Mike Repass used

output = csv.writer(sys.stdout)

output.writerow(data[0].keys())  # header row

for row in data:
  output.writerow(row.values())

However, just change the initial setup to output = csv.DictWriter(filesetting, fieldnames=data[0].keys())

Note that since the order of elements in a dictionary is not defined (before Python 3.7), you might have to create the fieldnames entries explicitly. Once you do that, write the header with output.writeheader() and pass each dictionary directly to output.writerow(row) instead of row.values().
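A minimal sketch of the DictWriter approach described above, followed by reading the result back with csv.DictReader. The sample `data` is made up, the field names are listed explicitly, and an in-memory buffer stands in for the output file.

```python
import csv
import io

data = [
    {"pk": 22, "model": "auth.permission"},
    {"pk": 23, "model": "auth.permission"},
]

out = io.StringIO()
# Explicit fieldnames fix the column order regardless of dictionary ordering.
output = csv.DictWriter(out, fieldnames=["pk", "model"], lineterminator="\n")
output.writeheader()          # header row, taken from fieldnames
for row in data:
    output.writerow(row)      # each row is passed as a dict, not as row.values()

# The header row lets csv.DictReader rebuild the dictionaries later.
# Note that every value comes back as a string.
rows_back = list(csv.DictReader(io.StringIO(out.getvalue())))
print(rows_back)
```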

Unfortunately, I do not have enough reputation to comment on the amazing @Alec McGail answer, so I am adding a small contribution here. I was using Python 3 and needed to convert the map to a list, following @Alexis R's comment.

Additionally, I found that the csv writer was adding an extra CR to the file (there was an empty line after each line of data in the csv file). The solution was very easy, following @Jason R. Coombs' answer to this thread: CSV in Python adding an extra carriage return

You simply need to add the lineterminator='\n' parameter to the csv.writer. It will be: csv_w = csv.writer( out_file, lineterminator='\n' )
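The effect of the lineterminator parameter can be seen without touching the file system. The sketch below writes the same row twice into in-memory buffers, once with the csv module's default line ending and once with '\n'.

```python
import csv
import io

# Default: the csv module terminates each row with '\r\n'.
default_out = io.StringIO()
csv.writer(default_out).writerow(["pk", "model"])

# With lineterminator='\n' each row ends with a single newline.
unix_out = io.StringIO()
csv.writer(unix_out, lineterminator="\n").writerow(["pk", "model"])

print(repr(default_out.getvalue()))
print(repr(unix_out.getvalue()))
```

On Python 3, the csv documentation recommends opening the output file with newline='' so that the '\r\n' written by the module is not translated a second time on Windows; either approach avoids the empty lines.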

You can use this code to convert a JSON file to a CSV file. After reading the file, I convert the object to a pandas dataframe and then save it to a CSV file.

import os
import pandas as pd
import json

data = []
os.chdir('D:\\Your_directory\\folder')
# the file is read line by line, so each line must be a complete JSON object
with open('file_name.json', encoding="utf8") as data_file:
    for line in data_file:
        data.append(json.loads(line))

dataframe = pd.DataFrame(data)
# Saving the dataframe to a csv file
dataframe.to_csv("filename.csv", encoding='utf-8', index=False)
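The code above parses the file one line at a time, which implies a file with one JSON object per line. For readers without pandas, the same conversion can be sketched with the standard library only. This assumes every line holds a flat object with the same keys, and it uses in-memory buffers (`json_lines` and `out` are hypothetical names) in place of real files.

```python
import csv
import io
import json

# Stand-in for a file that holds one JSON object per line.
json_lines = io.StringIO(
    '{"pk": 22, "model": "auth.permission"}\n'
    '{"pk": 23, "model": "auth.permission"}\n'
)

# Parse each non-empty line separately.
data = [json.loads(line) for line in json_lines if line.strip()]

out = io.StringIO()
# The header comes from the first object; all objects are assumed to share its keys.
writer = csv.DictWriter(out, fieldnames=list(data[0]), lineterminator="\n")
writer.writeheader()
writer.writerows(data)

print(out.getvalue())
```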

I have tried a lot of the suggested solutions (pandas was also not correctly normalizing my JSON), but the one that really parses the JSON data correctly is from Max Berman.

I wrote an improvement to avoid creating new columns for each row; values are added to the existing column during parsing instead. It also has the effect of storing a value as a string if only one value exists, and making a list if there are more values for that column.

It takes an input.json file as input and writes out an output.csv.

import json
import pandas as pd

def flatten_json(json):
    def process_value(keys, value, flattened):
        if isinstance(value, dict):
            for key in value.keys():
                process_value(keys + [key], value[key], flattened)
        elif isinstance(value, list):
            for idx, v in enumerate(value):
                process_value(keys, v, flattened)
                # process_value(keys + [str(idx)], v, flattened)
        else:
            key1 = '__'.join(keys)
            if not flattened.get(key1) is None:
                if isinstance(flattened[key1], list):
                    flattened[key1] = flattened[key1] + [value]
                else:
                    flattened[key1] = [flattened[key1]] + [value]
            else:
                flattened[key1] = value

    flattened = {}
    for key in json.keys():
        k = key
        # print("Key: " + k)
        process_value([key], json[key], flattened)
    return flattened

from io import StringIO

with open("input.json", "r") as f:
    y = json.load(f)
flat = flatten_json(y)
text = json.dumps(flat)
# newer pandas versions expect a file-like object rather than a literal JSON string
df = pd.read_json(StringIO(text))
df.to_csv('output.csv', index=False, encoding='utf-8')

I might be late to the party, but I think I have dealt with a similar problem. I had a JSON file which looked like this:

[Image: JSON file structure]

I only wanted to extract a few keys/values from these JSON files. So, I wrote the following code to extract them.

"""json_to_csv.py
This script reads any number of JSON files present in a folder, extracts certain data from each file, and writes it to a csv file.
The folder contains the Python script (json_to_csv.py), output.csv, and another folder, descriptions, containing all the JSON files.
"""

import os
import json
import csv


def get_list_of_json_files():
    """Returns the list of filenames of all the Json files present in the folder
    Parameter
    ---------
    directory : str
        'descriptions' in this case
    Returns
    -------
    list_of_files: list
        List of the filenames of all the json files
    """

    list_of_files = os.listdir('descriptions')  # creates list of all the files in the folder

    return list_of_files


def create_list_from_json(jsonfile):
    """Returns a list of the extracted items from json file in the same order we need it.
    Parameter
    _________
    jsonfile : json
        The json file containing the data
    Returns
    -------
    one_sample_list : list
        The list of the extracted items needed for the final csv
    """

    with open(jsonfile) as f:
        data = json.load(f)

    data_list = []  # create an empty list

    # append the items to the list in the same order.
    data_list.append(data['_id'])
    data_list.append(data['_modelType'])
    data_list.append(data['creator']['_id'])
    data_list.append(data['creator']['name'])
    data_list.append(data['dataset']['_accessLevel'])
    data_list.append(data['dataset']['_id'])
    data_list.append(data['dataset']['description'])
    data_list.append(data['dataset']['name'])
    data_list.append(data['meta']['acquisition']['image_type'])
    data_list.append(data['meta']['acquisition']['pixelsX'])
    data_list.append(data['meta']['acquisition']['pixelsY'])
    data_list.append(data['meta']['clinical']['age_approx'])
    data_list.append(data['meta']['clinical']['benign_malignant'])
    data_list.append(data['meta']['clinical']['diagnosis'])
    data_list.append(data['meta']['clinical']['diagnosis_confirm_type'])
    data_list.append(data['meta']['clinical']['melanocytic'])
    data_list.append(data['meta']['clinical']['sex'])
    data_list.append(data['meta']['unstructured']['diagnosis'])
    # In few json files, the race was not there so using KeyError exception to add '' at the place
    try:
        data_list.append(data['meta']['unstructured']['race'])
    except KeyError:
        data_list.append("")  # will add an empty string in case race is not there.
    data_list.append(data['name'])

    return data_list


def write_csv():
    """Creates the desired csv file
    Parameters
    __________
    list_of_files : file
        The list created by get_list_of_json_files() method
    result.csv : csv
        The csv file containing the header only
    Returns
    _______
    result.csv : csv
        The desired csv file
    """

    list_of_files = get_list_of_json_files()
    # open the csv once and append one row per json file
    with open('output.csv', 'a', newline='') as c:
        writer = csv.writer(c)
        for file in list_of_files:
            row = create_list_from_json(f'descriptions/{file}')  # create the row to be added to csv for each file (json-file)
            writer.writerow(row)


if __name__ == '__main__':
    write_csv()

I hope this will help. For details on how this code works, you can check here.
