Add characters and remove the last comma in a JSON file

Question

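I am trying to create a JSON file from a CSV. The code below produces the data, but not quite in the form I want. I have some experience in Python. As I understand it, a JSON file should be written as [{}, {}, ..., {}].

How can I:

Insert the ',' between records but remove the last ','?
Insert '[' at the very beginning and ']' at the very end? I tried putting it in outputfile.write('['... etc.), but it showed up in too many places.
Leave the header out of the first line of the JSON file?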

Names.csv：

id,team_name,team_members
123,Biology,"Ali Smith, Jon Doe"
234,Math,Jane Smith 
345,Statistics ,"Matt P, Albert Shaw"
456,Chemistry,"Andrew M, Matt Shaw, Ali Smith"
678,Physics,"Joe Doe, Jane Smith, Ali Smith "

Code:

import csv
import json
import os

with open('names.csv', 'r') as infile, open('names1.json','w') as outfile:
    for line in infile:
         row = dict()
         # print(row)
         id, team_name, *team_members = line.split(',')
         row["id"] = id;
         row["team_name"] = team_name;
         row["team_members"] = team_members
         json.dump(row,outfile)
         outfile.write("," + "\n" )

Output so far:

{"id": "id", "team_name": "team_name", "team_members": ["team_members\n"]},
{"id": "123", "team_name": "Biology", "team_members": ["\"Ali Smith", " Jon Doe\"\n"]},
{"id": "234", "team_name": "Math", "team_members": ["Jane Smith \n"]},
{"id": "345", "team_name": "Statistics ", "team_members": ["\"Matt P", " Albert Shaw\"\n"]},
{"id": "456", "team_name": "Chemistry", "team_members": ["\"Andrew M", " Matt Shaw", " Ali Smith\"\n"]},
{"id": "678", "team_name": "Physics", "team_members": ["\"Joe Doe", " Jane Smith", " Ali Smith \""]},

Answer 1

First, how do you skip the header? That part is easy:

next(infile) # skip the first line
for line in infile:

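However, you may want to consider using csv.DictReader for the input. It reads the header row, uses the names there to build a dict for each row, and splits the rows for you (as well as handling cases you might not have thought of, such as quoted or escaped text that can appear in a CSV file):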

for row in csv.DictReader(infile):
    json.dump(row, outfile)
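
To make the difference concrete, here is a minimal sketch comparing a plain split(',') with csv.DictReader on one row of the sample data. It reads from an in-memory io.StringIO instead of names.csv, purely so the example is self-contained; the variable names are illustrative.

```python
import csv
import io

# Header plus one row from the sample names.csv, held in memory for illustration.
sample = 'id,team_name,team_members\n123,Biology,"Ali Smith, Jon Doe"\n'

# Naive splitting breaks the quoted field apart and keeps the quotes and newline.
naive = sample.splitlines(keepends=True)[1].split(',')

# DictReader consumes the header row and parses the quoted field as one value.
parsed = list(csv.DictReader(io.StringIO(sample)))

print(naive)
print(parsed[0])
```

Note that DictReader leaves team_members as a single string; splitting it into a list of names is a separate step.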

Now on to the harder problem.

A better solution would probably be to use an iterative JSON library that can dump an iterator as a JSON array. Then you could do something like this:

def rows(infile):
    for line in infile:
         row = dict()
         # print(row)
         id, team_name, *team_members = line.split(',')
         row["id"] = id;
         row["team_name"] = team_name;
         row["team_members"] = team_members
         yield row

with open('names.csv', 'r') as infile, open('names1.json','w') as outfile:
    # genjson is a placeholder name for such a streaming library, not a specific package
    genjson.dump(rows(infile), outfile)

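The stdlib json.JSONEncoder has an example in the docs that does exactly this, although not very efficiently, because it first consumes the entire iterator to build a list and then dumps that list: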

class GenJSONEncoder(json.JSONEncoder):
    def default(self, o):
       try:
           iterable = iter(o)
       except TypeError:
           pass
       else:
           return list(iterable)
       # Let the base class default method raise the TypeError
       return json.JSONEncoder.default(self, o)

j = GenJSONEncoder()
with open('names.csv', 'r') as infile, open('names1.json','w') as outfile:
    outfile.write(j.encode(rows(infile)))

In fact, if you're willing to build a whole list rather than encode row by row, it may be simpler to just do the listification explicitly:

with open('names.csv', 'r') as infile, open('names1.json','w') as outfile:
    json.dump(list(rows(infile)), outfile)

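You can also do this by overriding the iterencode method, but that is a lot less trivial, and you'd probably want to look for an efficient, well-tested streaming iterative JSON library on PyPI rather than building it yourself out of the json module.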

In the meantime, though, here is a direct solution to your question that changes as little as possible in your existing code:

with open('names.csv', 'r') as infile, open('names1.json','w') as outfile:
    # print the opening [
    outfile.write('[\n')
    # keep track of the index, just to distinguish line 0 from the rest
    for i, line in enumerate(infile):
         row = dict()
         # print(row)
         id, team_name, *team_members = line.split(',')
         row["id"] = id;
         row["team_name"] = team_name;
         row["team_members"] = team_members
         # add the ,\n _before_ each row except the first
         if i:
             outfile.write(',\n')
         json.dump(row,outfile)
    # write the final ]
    outfile.write('\n]')

This trick, treating the first element specially rather than the last, simplifies a lot of problems of this type.
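
The same first-element trick can be pulled out into a small reusable helper. The sketch below is only an illustration: write_json_array is a hypothetical name, and it writes to an in-memory buffer so it runs without any files.

```python
import io
import json

def write_json_array(items, outfile):
    """Stream items to outfile as a JSON array, one element per line."""
    outfile.write('[\n')
    first = True
    for item in items:
        # Write the separator before every element except the first.
        if not first:
            outfile.write(',\n')
        json.dump(item, outfile)
        first = False
    outfile.write('\n]')

# Feed it a generator, so no full list is ever built in memory.
buffer = io.StringIO()
write_json_array(({"id": str(n)} for n in (123, 234)), buffer)
text = buffer.getvalue()
print(text)
```

Because the separator is written before each element rather than after it, there is never a trailing comma to remove, and an empty input still yields a valid empty array.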

Another way to simplify things is to actually iterate over adjacent pairs of lines, using a minor variation on the pairwise example in the itertools docs:

import itertools

def pairwise(iterable):
    # pair each item with its successor; the last item is paired with None
    a, b = itertools.tee(iterable)
    next(b, None)
    return itertools.zip_longest(a, b, fillvalue=None)

with open('names.csv', 'r') as infile, open('names1.json','w') as outfile:
    # print the opening [
    outfile.write('[\n')
    # iterate pairs of lines
    for line, nextline in pairwise(infile):
         row = dict()
         # print(row)
         id, team_name, *team_members = line.split(',')
         row["id"] = id;
         row["team_name"] = team_name;
         row["team_members"] = team_members
         json.dump(row,outfile)
         # add the , if there is a next line
         if nextline is not None:
             outfile.write(',')
         outfile.write('\n')
    # write the final ]
    outfile.write(']')

This is just as efficient as the previous version, and conceptually simpler, but a lot more abstract.

Answer 2

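With minimal edits to your code, you can build a list of dictionaries in Python and dump it to the file as JSON all at once (assuming your dataset is small enough to fit in memory):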

import csv
import json
import os

rows = []  # Create list
with open('names.csv', 'r') as infile, open('names1.json','w') as outfile:
    for line in infile:
         row = dict()
         id, team_name, *team_members = line.split(',')
         row["id"] = id;
         row["team_name"] = team_name;
         row["team_members"] = team_members
         rows.append(row)  # Append row to list

    json.dump(rows[1:], outfile)  # Write entire list to file (except first row)

As an aside, you shouldn't use id as a variable name in Python, because it is a built-in function.
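
A minimal sketch of the rename, assuming a name such as team_id (hypothetical, any non-built-in name works) and a single hard-coded line instead of the file:

```python
line = '123,Biology,Ali Smith\n'

# team_id (a hypothetical name) leaves the built-in id() untouched.
team_id, team_name, *team_members = line.rstrip('\n').split(',')
row = {"id": team_id, "team_name": team_name, "team_members": team_members}

# The built-in is still callable because nothing has shadowed it.
print(callable(id))
```

The JSON key can still be "id"; only the Python variable name changes.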

Answer 3

pandas can solve this problem easily:

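import json
import pandas as pd

df = pd.read_csv('names.csv', dtype=str)
# split the comma-separated names into a list and trim whitespace
df['team_members'] = (df['team_members']
                      .map(lambda s: s.split(','))
                      .map(lambda l: [x.strip() for x in l]))
records = df.to_dict('records')
with open('names1.json', 'w') as outfile:
    json.dump(records, outfile)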

Answer 4

It seems like it would be a lot easier to use csv.DictReader than to reinvent the wheel:

import csv
import json

data = []
with open('names.csv', 'r', newline='') as infile:
    for row in csv.DictReader(infile):
        data.append(row)

with open('names1.json','w') as outfile:
    json.dump(data, outfile, indent=4)

Here are the contents of the names1.json file after running it (I used indent=4 just to make it easier to read):

[
    {
        "id": "123",
        "team_name": "Biology",
        "team_members": "Ali Smith, Jon Doe"
    },
    {
        "id": "234",
        "team_name": "Math",
        "team_members": "Jane Smith"
    },
    {
        "id": "345",
        "team_name": "Statistics ",
        "team_members": "Matt P, Albert Shaw"
    },
    {
        "id": "456",
        "team_name": "Chemistry",
        "team_members": "Andrew M, Matt Shaw, Ali Smith"
    },
    {
        "id": "678",
        "team_name": "Physics",
        "team_members": "Joe Doe, Jane Smith, Ali Smith"
    }
]
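
If team_members should be a list of names, as in the question's output, rather than one string, a small extra step per row can split it. This is a sketch under that assumption; it uses an in-memory io.StringIO with two rows of the sample data in place of names.csv so that it runs on its own.

```python
import csv
import io
import json

# In-memory stand-in for names.csv (header plus two rows from the question).
sample = (
    'id,team_name,team_members\n'
    '123,Biology,"Ali Smith, Jon Doe"\n'
    '234,Math,Jane Smith \n'
)

data = []
for row in csv.DictReader(io.StringIO(sample)):
    # Turn the comma-separated string into a list of trimmed names.
    row['team_members'] = [name.strip() for name in row['team_members'].split(',')]
    data.append(row)

result = json.dumps(data, indent=4)
print(result)
```

With a real file, the io.StringIO(sample) argument would be replaced by the infile object opened as in the code above.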
