
Python: predicting column titles from column data

I have a lot of CSV files with different column names but similar data, for example:


account  name    address
   1      2         3     
   4      5         6     

lookup  accountname accountaddress
   7      8         9     
   10     11       12     

Here account and lookup are the same field, name and accountname are the same field, and so on. Is there a way to normalize or classify all of these into one common set of column names? I can't map them with a hash because the column names are never the same: every new file uses different column names, and the order in which the columns appear in the table also differs.

You can try something like this:

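  1. Parse your CSV data using csv.DictReader:

         import collections
         import csv

         def read_rows(file_path):  # read_rows is an illustrative name
             row_list = []
             with open(file_path, newline='') as f:
                 # fieldnames replaces the file's own header with your common column names,
                 # so a header row in the file is read as an ordinary data row
                 cf = csv.DictReader(f, delimiter=<field separator>, fieldnames=[<columnslist>])
                 for row in cf:
                     tmp_row = collections.OrderedDict()
                     for column in cf.fieldnames:
                         tmp_row[column] = row[column]
                     row_list.append(tmp_row)
             return row_list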

Then write the row_list object to a CSV file using csv.DictWriter.
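
The read and write steps can be sketched end to end as follows. This is only a minimal illustration, not the answerer's exact code: COMMON_FIELDS, read_rows and write_rows are hypothetical names, the comma delimiter is an assumption, and in-memory io.StringIO objects stand in for real files so the example is self-contained. It also skips the file's original header row, assuming every file has one.

```python
import csv
import io

# Hypothetical canonical names, listed in the order the columns appear in the file.
COMMON_FIELDS = ["account", "name", "address"]

def read_rows(f, fieldnames, delimiter=","):
    """Read rows from an open CSV file, relabelling the columns by position."""
    reader = csv.DictReader(f, fieldnames=fieldnames, delimiter=delimiter)
    # Because fieldnames is given, the file's own header is treated as data; skip it.
    next(reader, None)
    return [dict(row) for row in reader]

def write_rows(f, rows, fieldnames):
    """Write the relabelled rows to an open file with the common header."""
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

# Stand-in for one of the input files from the question.
source = io.StringIO("lookup,accountname,accountaddress\n7,8,9\n10,11,12\n")
rows = read_rows(source, COMMON_FIELDS)

# Stand-in for the output file.
target = io.StringIO()
write_rows(target, rows, COMMON_FIELDS)
normalized = target.getvalue()
```

Note that this relabels columns purely by position, so it only helps when the column order is known for each file. If the order varies between files, as the question describes, the fieldnames list would have to be chosen per file.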

