简体   繁体   中英

python read csv file with row and column headers into dictionary with two keys

I have csv file of the following format,

,col1,col2,col3
row1,23,42,77
row2,25,39,87
row3,48,67,53
row4,14,48,66

I need to read this into a dictionary of two keys such that

dict1['row1']['col2'] = 42
dict1['row4']['col3'] = 66

If I try to use csv.DictReader with default options

with open(filePath, "rb" ) as theFile:
    reader = csv.DictReader(theFile, delimiter=',')
    for line in reader:
    print line

I get the following output

{'': 'row1', 'col2': '42', 'col3': '77', 'col1': '23'}
{'': 'row2', 'col2': '39', 'col3': '87', 'col1': '25'}
{'': 'row3', 'col2': '67', 'col3': '53', 'col1': '48'}
{'': 'row4', 'col2': '48', 'col3': '66', 'col1': '14'}

I'm not sure of how to process this output to create the type of dictionary that I'm interested in.

For sake of completeness, it would also help if you can address how to write back the dictionary into a csv file with the above format

Using the CSV module:

import csv
dict1 = {}

with open("test.csv", "rb") as infile:
    reader = csv.reader(infile)
    headers = next(reader)[1:]
    for row in reader:
        dict1[row[0]] = {key: int(value) for key, value in zip(headers, row[1:])}

You can use pandas for that even if it is a bit an overkill. The pro is that there is almost nothing to code to obtain the expected result.

# Reading the file
df = pd.read_csv('tmp.csv', index_col=0)

# Creating the dict
d = df.transpose().to_dict(orient='series')

print(d['row1']['col2'])
42

The format of the input file is not exactly convenient to parse with csv module. I'd parse headers separately, then parse the rest line by line, splitting by , , stripping and making dictionaries along the way. The working code:

from pprint import pprint

d = {}
with open("myfile.csv") as f:
    headers = [header.strip() for header in next(f).split(",")[1:]]

    for line in f:
        values = [value.strip() for value in line.split(",")]
        d[values[0]] = dict(zip(headers, values[1:]))

pprint(d)

Prints:

{'row1': {'col1': '23', 'col2': '42', 'col3': '77'},
 'row2': {'col1': '25', 'col2': '39', 'col3': '87'},
 'row3': {'col1': '48', 'col2': '67', 'col3': '53'},
 'row4': {'col1': '14', 'col2': '48', 'col3': '66'}}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM