
Python: converting csv to dict - using headers as keys

Python: 3.x

Hi. I have the CSV file below, which has a header and data rows; the row count may vary from file to file. I am trying to convert this CSV to a list of dicts, but the data from the first row is being repeated.

"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"
1,3,9294899
1,3,9294933

Code:

parserd_list = []
output_dict = {}
with open("files\\CUCMdummy.csv") as myfile:
    firstline = True
    for line in myfile:
        if firstline:
            mykeys = ''.join(line.split()).split(',')
            firstline = False
        else:
            values = ''.join(line.split()).split(',')
            for n in range(len(mykeys)):
                output_dict[mykeys[n].rstrip('"').lstrip('"')] = values[n].rstrip('"').lstrip('"')
                print(output_dict)
                parserd_list.append(output_dict)
#print(parserd_list)

(Generally my CSV has more than 20 columns, but I have presented a sample file.)

(I have used rstrip/lstrip to get rid of the double quotes.)

Output I am getting:

{'cdrRecordType': '1'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933'}

This is the output of the print inside the for loop, and the final output is the same.

I don't know what mistake I am making. Could someone please help me correct it?

Thanks in advance.

Use csv.DictReader:

import csv

# newline='' is what the csv module documentation recommends when opening a file for reading.
with open("files\\CUCMdummy.csv", mode='r', newline='') as myFile:
    reader = list(csv.DictReader(myFile, delimiter=',',quotechar='"'))

Instead of manually parsing a CSV file, you should use the csv module.

This results in a simpler script and handles edge cases gracefully (e.g., the header row, inconsistently quoted fields, etc.).

import csv

with open('example.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row)

Output:

$ python3 parse-csv.py
OrderedDict([('cdrRecordType', '1'), ('globalCallID_callManagerId', '3'), ('globalCallID_callId', '9294899')])
OrderedDict([('cdrRecordType', '1'), ('globalCallID_callManagerId', '3'), ('globalCallID_callId', '9294933')])
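
To illustrate the point about quoted fields, here is a minimal sketch comparing a naive split with csv.DictReader on a row whose quoted field contains a comma. The sample data and the name/note column names are made up for illustration, and io.StringIO stands in for a real file so that the snippet runs on its own.

```python
import csv
import io

# Hypothetical sample data: the second field contains a comma inside quotes.
sample = 'name,note\n"Smith","hello, world"\n'

# Naive splitting cuts the quoted field into two pieces.
naive = sample.splitlines()[1].split(',')

# csv.DictReader respects the quoting and keeps the field whole.
rows = [dict(row) for row in csv.DictReader(io.StringIO(sample))]

print(naive)
print(rows)
```

Wrapping each row in dict() gives plain dictionaries regardless of which mapping type the installed Python version returns from DictReader.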

If you're intent on parsing manually, here's an approach for doing so:

parsed_list = []
with open('example.csv') as myfile:
    firstline = True
    for line in myfile:
        # Strip leading/trailing whitespace and split into a list of values.
        values = line.strip().split(',')

        # Remove surrounding double quotes from each value, if they exist.
        values = [v.strip('"') for v in values]

        # Use the first line as keys.
        if firstline:
            keys = values
            firstline = False
            # Skip to the next iteration of the for loop.
            continue

        parsed_list.append(dict(zip(keys, values)))

for p in parsed_list:
    print(p)

Output:

$ python3 manual-parse-csv.py
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933'}

The indentation of your code is wrong.

These two lines:

  print(output_dict)
  parserd_list.append(output_dict)

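can simply be un-indented by one level so that they line up with the inner for loop above them and run once per row instead of once per column. On top of this, you need to create a new dict for each line of the file; otherwise every entry in the list refers to the same dict object.

You can do this by adding output_dict = {} right before the for loop over the keys.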
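
Put together, the two fixes described here might look like the sketch below. It keeps the variable names from the question but reads the sample rows from an io.StringIO buffer instead of files\\CUCMdummy.csv; that substitution is an assumption made only so the snippet is self-contained.

```python
import io

# Stand-in for the CSV file from the question, so the sketch runs on its own.
myfile = io.StringIO(
    '"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"\n'
    '1,3,9294899\n'
    '1,3,9294933\n'
)

parserd_list = []
firstline = True
for line in myfile:
    if firstline:
        mykeys = ''.join(line.split()).split(',')
        firstline = False
    else:
        values = ''.join(line.split()).split(',')
        # Fix 1: start a fresh dict for every data line.
        output_dict = {}
        for n in range(len(mykeys)):
            output_dict[mykeys[n].strip('"')] = values[n].strip('"')
        # Fix 2: append once per line, after the inner loop has finished.
        parserd_list.append(output_dict)

print(parserd_list)
```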

As mentioned above, there are libraries that will make life easier. But if you want to stick to appending dictionaries, you can also read all the lines of the file, close it, and then process the lines like this:

with open("scratch.txt") as myfile:
    data = myfile.readlines()

keys = data[0].replace('"','').strip().split(',')

output_dicts = []
for line in data[1:]:
    values = line.strip().split(',')
    output_dicts.append(dict(zip(keys, values)))

print(output_dicts)

Output:

[{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}, {'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933'}]
