I have a lot of CSV files with different column names but similar data, for example:
    account  name  address
    1        2     3
    4        5     6

    lookup  accountname  accountaddress
    7       8            9
    10      11           12
Here account and lookup are the same field, name and accountname are the same, and so on. Is there a way to normalize or classify all of these under one common set of column names? I can't map this with a simple hash because the column names are never the same: every new file uses different names, and the columns also appear in a different order.
You can try something like this:
    import csv
    import collections

    def read_rows(file_path):
        row_list = []
        with open(file_path) as f:
            cf = csv.DictReader(f, delimiter=<field separator>, fieldnames=[<columnslist>])
            for row in cf:
                tmp_row = collections.OrderedDict()
                for column in cf.fieldnames:
                    tmp_row[column] = row[column]
                row_list.append(tmp_row)
        return row_list
Then write row_list back out to a CSV file using csv.DictWriter.
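Reading the rows alone does not unify the differing headers, though. One way to handle that, sketched below under the assumption that you can enumerate the header variants you have seen so far, is a hypothetical alias map that rekeys every row to a canonical column name as it is read:

    import csv
    import io

    # Hypothetical alias map: each known header variant points to one
    # canonical column name. Extend it whenever a new file introduces
    # a new spelling.
    ALIASES = {
        "account": "account",
        "lookup": "account",
        "name": "name",
        "accountname": "name",
        "address": "address",
        "accountaddress": "address",
    }

    def normalize_rows(lines, delimiter=","):
        """Read CSV lines and rekey every row to canonical column names.

        Unknown headers are passed through unchanged, so new variants
        show up in the output and can be added to ALIASES.
        """
        reader = csv.DictReader(lines, delimiter=delimiter)
        for row in reader:
            yield {ALIASES.get(col.strip().lower(), col): value
                   for col, value in row.items()}

    # Usage: two files with different headers and column orders yield
    # identically keyed rows, ready for a single csv.DictWriter.
    file1 = io.StringIO("account,name,address\n1,2,3\n")
    file2 = io.StringIO("accountaddress,lookup,accountname\n9,7,8\n")
    rows = list(normalize_rows(file1)) + list(normalize_rows(file2))

Because DictReader keys each row by header name rather than position, the differing column order across files costs nothing extra; only the spelling of the headers has to be reconciled, and the alias map does exactly that.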