
Reading header of csv file and seeing if it matches a dictionary key, then write value of that key to row

Basically I'll have a bunch of small dictionaries, like so:

dictionary_list = [
{"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
{"nine": "yes", "king": "yes","we": "yes", "nineteen": "yes"}
]

Then I have a CSV file with a whole bunch of columns whose headers are also words, like this: (image: enter image description here) There could be 500 columns, each with one word, and I don't know the order in which the columns appear. I do, however, know that any word in my small dictionaries should match the word in one of the columns.

I want to iterate through the headers of the file (skipping the first 5 column headers) and each time check whether the header name can be found in the dictionary. If so, write that key's value into the row; if not, write "no". This is done row by row, where each row corresponds to one of the small dictionaries. The results for this file using the dictionaries above would be:
(image: enter image description here)

So far I've tried the following, which doesn't really work:

f = open("file.csv", "r")
writer = csv.DictWriter(f)
for dict in dictionary_list: # this is the collection of little dictionaries
    # do some other stuff
    for r in writer: 
        #not sure how to skip 10 columns here. next() seems to work on rows 
        for col in r:
            if col in dict.keys():
                writer.writerow(dict.values())
            else:
                writer.writerow("no")

pandas may help you.

The documentation is at http://pandas.pydata.org/pandas-docs/stable/ .

You can read the CSV file with pandas.read_csv() and add the data you want with DataFrame.append() (removed in pandas 2.0; use pandas.concat() there).

Hope this helps.
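To make the pandas suggestion a bit more concrete, here is a minimal sketch. It assumes the header looks like the shortened, made-up line in csv_text (the real file is not shown), and it builds the rows with reindex() and fillna() rather than append(). Treat it as one possible approach, not the only way to do this in pandas.

```python
import io

import pandas as pd

dictionary_list = [
    {"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
    {"nine": "yes", "king": "yes", "we": "yes", "nineteen": "yes"},
]

# Hypothetical stand-in for the real file: a header line only.
csv_text = "row1,row2,row3,row4,row5,bad,eight,nine,queen,king,we,eighteen,nineteen\n"

# nrows=0 parses just the header, so a wide file is not loaded into memory.
headers = list(pd.read_csv(io.StringIO(csv_text), nrows=0).columns)
word_cols = headers[5:]  # skip the first 5 columns

# One row per dictionary, aligned to the header order.
# Columns that no dictionary mentions are created as "no";
# the remaining gaps (keys missing from one dictionary) are filled as well.
frame = (
    pd.DataFrame(dictionary_list)
    .reindex(columns=word_cols, fill_value="no")
    .fillna("no")
)

print(frame.to_csv(index=False))
```

Keys that do not appear in the header are dropped by reindex(), so a stray word in a dictionary does not create a new column.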

Your question appears to ask how to ensure that the fields from your dictionary_list exist in each record: if a field already exists in the record, set its value to yes; otherwise add the field to the record and set its value to no.

#!/usr/bin/env python3

import csv


dictionary_list = [
    {"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
    {"nine": "yes", "king": "yes","them": "yes", "nineteen": "yes"}
]

"""
flatten all the dictionary keys into a unique set, as the
key names will be used for field names and can't be duplicated
"""
field_check = set([k for d in dictionary_list for k in d.keys()])

if __name__ == "__main__":

    with open("file.csv", "r") as f:
        reader = csv.DictReader(f)

        # do not consider the first 10 columns
        field_tail = set(reader.fieldnames[10:])

        """
        initialize yes and no fields as they
        should be the same for every row in the file
        """
        yes_fields = set(field_check & field_tail)
        no_fields = field_check.difference(yes_fields)
        yes_dict = {k:"yes" for k in yes_fields}
        no_dict = {k:"no" for k in no_fields}
        for row in reader:
            row.update(yes_dict)
            row.update(no_dict)
            print(row)
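To see which fields end up in each group, the set logic can be run against an in-memory header. This sketch assumes the 20-column header line from the headers.csv example elsewhere on this page, plus one made-up data row. Note that with the first 10 columns skipped, eight, nine and queen fall outside field_tail and are therefore treated as "no" fields even though they are in the header.

```python
import csv
import io

dictionary_list = [
    {"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
    {"nine": "yes", "king": "yes", "them": "yes", "nineteen": "yes"},
]

# In-memory stand-in for file.csv: the header plus one placeholder row.
header = ("row1,row2,row3,row4,row5,bad,good,eight,nine,queen,"
          "three,eighteen,nineteen,king,jack,ace,we,them,you,two")
sample = io.StringIO(header + "\n" + ",".join(["x"] * 20) + "\n")

# All dictionary keys, de-duplicated.
field_check = {k for d in dictionary_list for k in d}

reader = csv.DictReader(sample)
field_tail = set(reader.fieldnames[10:])  # ignore the first 10 columns

yes_fields = field_check & field_tail  # keys present in the considered columns
no_fields = field_check - yes_fields   # keys that are not

print(sorted(yes_fields))  # ['eighteen', 'king', 'nineteen', 'them', 'we']
print(sorted(no_fields))   # ['eight', 'nine', 'queen']
```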

Given an input file headers.csv :

row1,row2,row3,row4,row5,bad,good,eight,nine,queen,three,eighteen,nineteen,king,jack,ace,we,them,you,two

The following code generates your output:

import csv

dictionary_list = [{"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
                   {"nine": "yes", "king": "yes","we": "yes", "nineteen": "yes"}]

# Read the input header line as a list
with open('headers.csv',newline='') as f:
    reader = csv.reader(f)
    headers = next(reader)

# Generate the fixed values for the first 5 columns.
rowvals = dict(zip(headers[:5],['x'] * 5))

with open('file.csv', 'w', newline='') as f:
    # When writing a row, restval is the default value when it isn't in the dict row.
    # extrasaction='ignore' prevents complaining if all columns are not present in dict row.
    writer = csv.DictWriter(f,headers,restval='no',extrasaction='ignore')
    writer.writeheader()
    for dictionary in dictionary_list:
        D = dictionary.copy() # needed if the original shouldn't be modified.
        D.update(rowvals)
        writer.writerow(D)

Output:

row1,row2,row3,row4,row5,bad,good,eight,nine,queen,three,eighteen,nineteen,king,jack,ace,we,them,you,two
x,x,x,x,x,no,no,yes,no,yes,no,yes,no,no,no,no,yes,no,no,no
x,x,x,x,x,no,no,no,yes,no,no,no,yes,yes,no,no,yes,no,no,no
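The two DictWriter options do most of the work here, so a small in-memory sketch may help show what each one does. The four-column header is shortened and made up for illustration, and io.StringIO stands in for the output file.

```python
import csv
import io

headers = ["row1", "eight", "nine", "we"]  # shortened, hypothetical header

buf = io.StringIO()
writer = csv.DictWriter(buf, headers, restval="no", extrasaction="ignore")
writer.writeheader()
# "nine" is missing from the dict, so restval fills in "no".
# "king" is not a column, so extrasaction="ignore" drops it silently.
writer.writerow({"row1": "x", "eight": "yes", "we": "yes", "king": "yes"})

lines = buf.getvalue().splitlines()
print(lines)  # ['row1,eight,nine,we', 'x,yes,no,yes']

# With the default extrasaction="raise", the unknown key is rejected instead.
strict = csv.DictWriter(io.StringIO(), headers, restval="no")
try:
    strict.writerow({"king": "yes"})
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

Leaving extrasaction at its default is a reasonable choice if a dictionary key that matches no column should be treated as an error rather than ignored.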


 