
Importing/Exporting a nested dictionary from a CSV file

So I have a CSV file with the data arranged like this:

X,a,1,b,2,c,3
Y,a,1,b,2,c,3,d,4
Z,l,2,m,3

I want to import the CSV to create a nested dictionary that looks like this:

data = {'X' : {'a' : 1, 'b' : 2, 'c' : 3}, 
        'Y' : {'a' : 1, 'b' : 2, 'c' : 3, 'd' : 4},
        'Z' : {'l' : 2, 'm' :3}}

After updating the dictionary in my program (I have that part figured out), I want to export it back to the same CSV file, overwriting the old contents. The output has to be in the same format as the original file so that I can import it again.

I have been playing around with the import and have this so far:

import csv
data = {}
with open('userdata.csv', 'r') as f:    
    reader = csv.reader(f)
    for row in reader:
       data[row[0]] = {row[i] for i in range(1, len(row))}

But this doesn't work as things are not arranged correctly. Some numbers are subkeys to other numbers, letters are out of place, etc. I haven't even gotten to the export part yet. Any ideas?

Since you're not interested in preserving order, something relatively simple should work:

import csv

# import
data = {}
with open('userdata.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        a = iter(row[1:])
        data[row[0]] = dict(zip(a, a))

# export
with open('userdata_exported.csv', 'w') as f:
    writer = csv.writer(f)
    for key, values in data.items():
        row = [key] + [value for item in values.items() for value in item]
        writer.writerow(row)

The latter could be done a little more efficiently by making a single call to the csv.writer's writerows() method and passing it a generator expression.

# export2
with open('userdata_exported.csv', 'w') as f:
    writer = csv.writer(f)
    rows = ([key] + [value for item in values.items() for value in item]
            for key, values in data.items())
    writer.writerows(rows)
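To check that the export really mirrors the import format, a round trip can be sketched with in-memory files. This is only a minimal Python 3 sketch: the helper names load_nested and dump_nested are made up for illustration, and the values come back as strings because the csv module does not convert types.

```python
import csv
import io


def load_nested(f):
    """Read rows of the form key,k1,v1,k2,v2,... into a nested dict."""
    data = {}
    for row in csv.reader(f):
        if not row:  # skip blank lines
            continue
        pairs = iter(row[1:])
        # zip() pulls from the same iterator twice, pairing neighbours
        data[row[0]] = dict(zip(pairs, pairs))
    return data


def dump_nested(data, f):
    """Write the nested dict back in the same key,k1,v1,... layout."""
    writer = csv.writer(f)
    writer.writerows(
        [key] + [field for item in values.items() for field in item]
        for key, values in data.items()
    )


source = "X,a,1,b,2,c,3\nY,a,1,b,2,c,3,d,4\nZ,l,2,m,3\n"
data = load_nested(io.StringIO(source))

buffer = io.StringIO()
dump_nested(data, buffer)
round_tripped = load_nested(io.StringIO(buffer.getvalue()))
```

If round_tripped equals data, the exported text can be fed straight back into the import step.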

You can use the grouper recipe from itertools:

from itertools import zip_longest  # Python 3; in Python 2 this is izip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

This groups the fields of each row into the (a, 1), (b, 2), (c, 3) pairs you want, so in your loop you can write data[row[0]] = {k: v for k, v in grouper(row[1:], 2)}.
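Put together with the reader loop, that might look like the sketch below. It assumes Python 3 (where the itertools function is called zip_longest) and reads from an in-memory sample instead of userdata.csv. The second sample row deliberately lacks a final value to show what the fill value does.

```python
import csv
import io
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# In-memory stand-in for the CSV file; the last row has a key with no value.
sample = io.StringIO("X,a,1,b,2,c,3\nZ,l,2,m\n")

data = {}
for row in csv.reader(sample):
    data[row[0]] = {k: v for k, v in grouper(row[1:], 2)}
```

One consequence of using grouper rather than plain zip: a trailing key with no value is kept and mapped to None instead of being silently dropped.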

from collections import defaultdict

data_lines = """X,a,1,b,2,c,3
Y,a,1,b,2,c,3,d,4
Z,l,2,m,3""".splitlines()

data = defaultdict(dict)

for line in data_lines:
    # you should probably add guards against invalid data, empty lines etc.
    main_key, sep, tail = line.partition(',')
    items = [item.strip() for item in tail.split(',')]
    # pair each sub-key with its value, converting the value to int
    items = zip(items[::2], map(int, items[1::2]))
    # data[main_key] = {key : value for key, value in items}
    data[main_key] = dict(items)

print(dict(data))
# {'X': {'a': 1, 'b': 2, 'c': 3},
#  'Y': {'a': 1, 'b': 2, 'c': 3, 'd': 4},
#  'Z': {'l': 2, 'm': 3}}
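The guards mentioned in the comment could be sketched roughly as follows. The function name parse_line and the choice to raise ValueError are assumptions made for illustration, not part of the answer above; a real program might prefer to log and skip bad lines instead.

```python
def parse_line(line):
    """Return (main_key, {subkey: int}), or None for a blank line."""
    line = line.strip()
    if not line:
        return None  # nothing to parse on an empty line
    main_key, _, tail = line.partition(',')
    items = [item.strip() for item in tail.split(',')] if tail else []
    if len(items) % 2:
        # an odd number of fields means some sub-key has no value
        raise ValueError("unpaired field in line: %r" % line)
    # int() itself raises ValueError if a value is not numeric
    return main_key, dict(zip(items[::2], map(int, items[1::2])))


data = {}
for line in ["X,a,1,b,2,c,3", "", "Z,l,2,m,3"]:
    parsed = parse_line(line)
    if parsed is not None:
        key, values = parsed
        data[key] = values
```

Blank lines are skipped, while a malformed line stops the import with an error rather than producing a half-filled dictionary.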

I'm lazy, so I might do something like this:

import csv
data = {}

with open('userdata.csv', 'rb') as f:    
    reader = csv.reader(f)
    for row in reader:
        data[row[0]] = dict(zip(row[1::2], map(int,row[2::2])))

which works because row[1::2] gives every other element starting at index 1, and row[2::2] gives every other element starting at index 2. zip pairs those elements up as tuples, and we pass the result to dict. This gives

{'Y': {'a': 1, 'c': 3, 'b': 2, 'd': 4}, 
 'X': {'a': 1, 'c': 3, 'b': 2}, 
 'Z': {'m': 3, 'l': 2}}

(Note that I changed your open to use 'rb', which is right for Python 2; if you're using Python 3, you want 'r', newline='' instead.)
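For Python 3, the whole import, update and overwrite cycle might be sketched like this. The temporary path stands in for userdata.csv and the added sub-key 'n' is an invented example update; only the open(..., newline='') calls and the slicing trick come from the answer above.

```python
import csv
import os
import tempfile

# Stand-in for userdata.csv so the sketch does not touch a real file.
path = os.path.join(tempfile.mkdtemp(), "userdata.csv")
with open(path, "w", newline="") as f:
    f.write("X,a,1,b,2,c,3\r\nZ,l,2,m,3\r\n")

# import: newline='' lets the csv module handle line endings itself
data = {}
with open(path, "r", newline="") as f:
    for row in csv.reader(f):
        data[row[0]] = dict(zip(row[1::2], map(int, row[2::2])))

# update (hypothetical change made by the program)
data["Z"]["n"] = 4

# export: opening the same path with 'w' overwrites the old contents
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    for key, values in data.items():
        writer.writerow([key] + [field for item in values.items() for field in item])

with open(path, "r", newline="") as f:
    lines = f.read().splitlines()
```

Without newline='' on the write, the csv module's own line terminator can end up doubled on Windows, which shows up as blank rows between records.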

