Creating a dictionary from a CSV file

I am trying to take input from a CSV file and then push it into a dictionary format (I am using Python 3.x).

I use the code below to read in the CSV file and that works:

import csv

reader = csv.reader(open('C:\\Users\\Chris\\Desktop\\test.csv'), delimiter=',', quotechar='|')

for row in reader:
    print(', '.join(row))

But now I want to place the results into a dictionary. I would like the first row of the CSV file to be used as the "key" field for the dictionary with the subsequent rows in the CSV file filling out the data portion.

Sample data:

     Date        First Name     Last Name     Score
12/28/2012 15:15        John          Smith        20
12/29/2012 15:15        Alex          Jones        38
12/30/2012 15:15      Michael       Carpenter      25

How can I get the dictionary to work?

Create a dictionary, then iterate over the reader and store each row in the dictionary. Note that if you encounter a row with a duplicate date, you will have to decide what to do (raise an exception, replace the previous row, discard the later row, etc.).

Here's test.csv:

Date,Foo,Bar
123,456,789
abc,def,ghi

and the corresponding program:

import csv
reader = csv.reader(open('test.csv'))

result = {}
for row in reader:
    key = row[0]
    if key in result:
        # implement your duplicate row handling here
        pass
    result[key] = row[1:]
print(result)

yields:

{'Date': ['Foo', 'Bar'], '123': ['456', '789'], 'abc': ['def', 'ghi']}

or, with DictReader:

import csv
reader = csv.DictReader(open('test.csv'))

result = {}
for row in reader:
    key = row.pop('Date')
    if key in result:
        # implement your duplicate row handling here
        pass
    result[key] = row
print(result)

results in:

{'123': {'Foo': '456', 'Bar': '789'}, 'abc': {'Foo': 'def', 'Bar': 'ghi'}}

Or perhaps you want to map the column headings to a list of values for that column:

import csv
reader = csv.DictReader(open('test.csv'))

result = {}
for row in reader:
    for column, value in row.items():  # consider .iteritems() for Python 2
        result.setdefault(column, []).append(value)
print(result)

That yields:

{'Date': ['123', 'abc'], 'Foo': ['456', 'def'], 'Bar': ['789', 'ghi']}
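
The duplicate-row handling left as a placeholder in the examples above could look something like the following. This is only a sketch: the function name load_by_date, the on_duplicate parameter, and the in-memory sample with a repeated Date value are all made up for illustration, and which policy is right depends on your data.

```python
import csv
import io

# Hypothetical sample with a repeated Date value (not from the original post).
SAMPLE = "Date,Foo,Bar\n123,456,789\nabc,def,ghi\n123,xxx,yyy\n"


def load_by_date(text, on_duplicate="raise"):
    """Map each Date value to the rest of its row.

    on_duplicate is an illustrative parameter:
      "raise"   -> raise ValueError on a repeated key
      "replace" -> the later row overwrites the earlier one
      "keep"    -> the earlier row wins, the later one is discarded
    """
    if on_duplicate not in ("raise", "replace", "keep"):
        raise ValueError("unknown on_duplicate policy: {!r}".format(on_duplicate))

    result = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = row.pop("Date")
        if key in result:
            if on_duplicate == "raise":
                raise ValueError("duplicate Date value: {!r}".format(key))
            if on_duplicate == "keep":
                continue  # discard the later row
            # "replace" falls through and overwrites the earlier row
        result[key] = row
    return result


print(load_by_date(SAMPLE, on_duplicate="replace"))
print(load_by_date(SAMPLE, on_duplicate="keep"))
```

With a real file you would pass an open file object to csv.DictReader instead of io.StringIO.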

You need Python's csv.DictReader class. More help can be found in the csv module documentation.

import csv

with open('file_name.csv', 'rt') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row)

The answer from @phil-frost was very helpful and exactly what I was looking for.

I made a few tweaks after that, so I would like to share them here:

import csv

def csv_as_dict(file, ref_header, delimiter=None):
    # Map each value of the ref_header column to the rest of its row.
    if not delimiter:
        delimiter = ';'
    result = {}
    with open(file, newline='') as f:
        for row in csv.DictReader(f, delimiter=delimiter):
            key = row.pop(ref_header)
            if key in result:
                # implement your duplicate row handling here
                pass
            result[key] = row
    return result

You can call it:

myvar = csv_as_dict(csv_file, 'ref_column')

Here 'ref_column' is the name of the column whose value becomes the key for each row.

import csv

def parser_csv(PATH):
    # Return a list with one dict per data row, keyed by the header row.
    list_dict = []
    with open("{}.csv".format(PATH), 'r', newline='') as f:
        reader = csv.reader(f)
        first_row = next(reader, [])  # the header row supplies the keys
        for row in reader:
            # build a fresh dict for every row so the list entries don't share one object
            list_dict.append(dict(zip(first_row, row)))
    return list_dict

print(len(parser_csv("path")))
# One less than the number of rows in the CSV file (the first row supplies the keys)
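
For comparison, csv.DictReader can build the same kind of list of per-row dictionaries directly. A minimal sketch, assuming the question's sample data is stored comma-separated (the exact file layout is a guess) and using an in-memory string in place of a file; rows_as_dicts is an illustrative name:

```python
import csv
import io

# Comma-separated version of the question's sample data (layout assumed).
SAMPLE = (
    "Date,First Name,Last Name,Score\n"
    "12/28/2012 15:15,John,Smith,20\n"
    "12/29/2012 15:15,Alex,Jones,38\n"
    "12/30/2012 15:15,Michael,Carpenter,25\n"
)


def rows_as_dicts(text):
    """Return one dict per data row, using the first row as the keys."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


rows = rows_as_dicts(SAMPLE)
print(len(rows))   # number of data rows, header excluded
print(rows[0])
```

All values come back as strings, so a column such as Score has to be converted with int() if you need numbers.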
