Converting CSV data to list in dictionary

I have a CSV file in the following form:

Name_1,2,K,14
Name_1,3,T,14
Name_1,4,T,18
Name_2,2,G,12
Name_2,4,T,14
Name_2,6,K,15
Name_3,2,K,12
Name_3,3,T,15
Name_3,4,G,18

I want to convert it into a dictionary where Name_x is the key and the corresponding rows are the value, as a list of lists. Something like this:

{'Name_1': [[2, 'K', 14], [3, 'T', 14], [4, 'T', 18]],
 'Name_2': [[2, 'G', 12], [4, 'T', 14], [6, 'K', 15]],
...}

So far, I think I have to use defaultdict:

from collections import defaultdict
d = defaultdict(list)

But how do I append the data to d? I know defaultdict does not have an append method.

You need to use the name as the key and append the rest of the row (a slice) as the value. Note that a plain dict or defaultdict only guarantees insertion order on Python 3.7 and later; on older versions the keys come back in arbitrary order:

import csv
from collections import defaultdict

with open('in.csv') as f:
    r = csv.reader(f)
    d = defaultdict(list)
    for row in r:
        d[row[0]].append(row[1:])
print(d)
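
To see why d[row[0]].append(...) works even though defaultdict itself has no append method: looking up a missing key makes defaultdict call list() and store the result, so append is called on that list. A minimal sketch, assuming a few sample rows held in an in-memory string (io.StringIO is a stand-in for in.csv here):

```python
import csv
import io
from collections import defaultdict

# Assumption: a few sample rows stand in for the contents of in.csv.
sample = "Name_1,2,K,14\nName_1,3,T,14\nName_2,2,G,12\n"

d = defaultdict(list)
for row in csv.reader(io.StringIO(sample)):
    # d[row[0]] creates an empty list on first access; append acts on that list.
    d[row[0]].append(row[1:])

print(dict(d))
```

Wrapping the result in dict() is only for a tidier printout; the defaultdict can be used directly.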

If you want to maintain order, you will need an OrderedDict:

import csv
from collections import OrderedDict

with open('in.csv') as f:
    r = csv.reader(f)
    od = OrderedDict()
    for row in r:
        # get key / first element in row
        key = row[0]
        # create the key/list pairing if it does not exist, else just append the value
        od.setdefault(key, []).append(row[1:])
print(od)

Output:

OrderedDict([('Name_1', [['2', 'K', '14'], ['3', 'T', '14'], ['4', 'T', '18']]), ('Name_2', [['2', 'G', '12'], ['4', 'T', '14'], ['6', 'K', '15']]), ('Name_3', [['2', 'K', '12'], ['3', 'T', '15'], ['4', 'G', '18']])])
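
Note that csv.reader returns every field as a string, while the output asked for in the question has integers. One possible way to bridge that gap is sketched below, for Python 3. The helper parse_field is a hypothetical name introduced here, not part of the csv module, and the sample string is an assumed stand-in for in.csv:

```python
import csv
import io
from collections import defaultdict


def parse_field(value):
    """Hypothetical helper: return an int when value looks like one, else the string."""
    try:
        return int(value)
    except ValueError:
        return value


# Assumption: a few sample rows stand in for the contents of in.csv.
sample = "Name_1,2,K,14\nName_1,3,T,14\nName_2,2,G,12\n"

d = defaultdict(list)
for key, *rest in csv.reader(io.StringIO(sample)):
    # Convert each remaining field before storing it under the name.
    d[key].append([parse_field(v) for v in rest])

print(dict(d))
```

The try/except approach keeps letters such as 'K' as strings; if a column may hold floats or blank rows may occur, the helper and loop would need extending.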

If rows with the same name are adjacent, you could also use groupby, which groups consecutive rows by the first item (the name) in each row:

import csv
from collections import OrderedDict
from itertools import groupby
from operator import itemgetter

with open('in.csv') as f:
    r = csv.reader(f)
    od = OrderedDict()
    for k, v in groupby(r, key=itemgetter(0)):
        od[k] = [sub[1:] for sub in v]
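
The grouping requirement matters because itertools.groupby only merges consecutive items with the same key. A small sketch of what happens otherwise, assuming already-parsed rows in which Name_1 appears in two separate runs (the rows list is made up for illustration):

```python
from itertools import groupby
from operator import itemgetter

# Assumption: rows already parsed, with Name_1 appearing in two separate runs.
rows = [
    ["Name_1", "2", "K", "14"],
    ["Name_2", "2", "G", "12"],
    ["Name_1", "3", "T", "14"],
]

# Without sorting, the second Name_1 run replaces the first one.
unsorted_result = {}
for k, v in groupby(rows, key=itemgetter(0)):
    unsorted_result[k] = [sub[1:] for sub in v]

# sorted() is stable, so rows keep their relative order within each name.
sorted_result = {}
for k, v in groupby(sorted(rows, key=itemgetter(0)), key=itemgetter(0)):
    sorted_result[k] = [sub[1:] for sub in v]

print(unsorted_result)
print(sorted_result)
```

Sorting first costs an extra pass and loses the original order of the names, so for unsorted input the setdefault or defaultdict approach is probably the simpler choice.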

If you are using Python 3, you can unpack the row using *:

import csv
from collections import OrderedDict

with open("in.csv") as f:
    r = csv.reader(f)
    od = OrderedDict()
    for row in r:
        key, *rest = row
        od.setdefault(key, []).append(rest)

The same unpacking works inside the groupby version:

import csv
from collections import OrderedDict
from itertools import groupby
from operator import itemgetter

with open('in.csv') as f:
    r = csv.reader(f)
    od = OrderedDict()
    for k, v in groupby(r, key=itemgetter(0)):
        od[k] = [sub for _, *sub in v]
print(od)

txtcsv = """Name_1,2,K,14
Name_1,3,T,14
Name_1,4,T,18
Name_2,2,G,12
Name_2,4,T,14
Name_2,6,K,15
Name_3,2,K,12
Name_3,3,T,15
Name_3,4,G,18"""

def save():
    with open("test.csv","w") as f:
        f.write(txtcsv)


if __name__ == "__main__":
    save()
    with open("test.csv") as f:
        d = {}
        for line in f:
            # split off the name; the rest of the line holds the values
            name, val = line.rstrip().split(",", 1)
            d.setdefault(name, []).append(val.split(","))
        print(d)

Off the top of my head (because I'm not too familiar with defaultdict), this should do roughly what you want.

data is the CSV string

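obj = {}

# splitlines() avoids a stray empty row when data ends with a newline
for line in data.splitlines():
    if not line:
        continue  # skip blank lines
    row = line.split(',')
    if row[0] in obj:
        obj[row[0]].append(row[1:])
    else:
        obj[row[0]] = [row[1:]]

print(obj)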
