
How to split a CSV file by date using Python

I have a CSV file that contains a date column formatted as "1929-01-10". I would like to split this huge file into separate files, one per year: every year that appears in the date column should get its own CSV file (ideally named after the year).

I would like to do this in Python.

  1. Set the directory where the new files will be written and the name of the main CSV file.
  2. Use the csv module to read and write the files.
  3. Use collections.defaultdict so that every key's value starts out as an empty list.
  4. Read the main file and iterate over every row in it.
  5. Split the first column of each row on "-" to get the year.
  6. Use the year as the key and append the row to the result dictionary.
  7. At this point the result dictionary holds every row, grouped by year.
  8. Iterate over every item in the result dictionary.
  9. Use the csv module again to write one CSV file per item.
  10. Use the key (the year) as the file name.
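Step 5 assumes that every value in the first column really is a date of the form "1929-01-10". If that is not guaranteed, the year could be extracted with a stricter check. The following is only a sketch: the helper name `year_of` is hypothetical and not part of the original answer, and it assumes that values which do not parse should simply be reported as None.

```python
from datetime import datetime


def year_of(date_text):
    """Return the 4-digit year of a "YYYY-MM-DD" string, or None if invalid.

    Hypothetical helper: strptime is used only to validate the date; the
    year itself is taken from the first four characters of the text.
    """
    text = date_text.strip()
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return None
    return text[:4]


print(year_of("1929-01-10"))
print(year_of("not a date"))
```

With a helper like this, rows for which it returns None could be skipped or collected in a separate file instead of producing an oddly named output file.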

input: main.csv

1929-01-10,1,a
1929-01-10,2,b
1930-01-10,3,c
1929-01-10,4,d
2015-01-10,5,e
2015-01-10,6,f
1929-01-10,7,g
2014-01-10,8,h

code:

import collections
import csv
import pprint

# Directory for the per-year files (must end with "/") and the input file.
src_path = "/home/vivek/Desktop/Work/stack/"
main_file = "/home/vivek/Desktop/Work/stack/main.csv"

# Group the rows by the year in the first column ("1929-01-10" -> "1929").
# In Python 3 the csv module expects files opened in text mode with newline="".
result = collections.defaultdict(list)
with open(main_file, newline="") as fp:
    root = csv.reader(fp, delimiter=',')
    for row in root:
        year = row[0].split("-")[0]
        result[year].append(row)

print("Result:-")
pprint.pprint(result)

# Write one CSV file per year, named after the year.
for year, rows in result.items():
    file_path = "%s%s.csv" % (src_path, year)
    with open(file_path, "w", newline="") as fp:
        writer = csv.writer(fp, delimiter=',')
        writer.writerows(rows)

output:

Result:-
defaultdict(<class 'list'>,
            {'1929': [['1929-01-10', '1', 'a'],
                      ['1929-01-10', '2', 'b'],
                      ['1929-01-10', '4', 'd'],
                      ['1929-01-10', '7', 'g']],
             '1930': [['1930-01-10', '3', 'c']],
             '2014': [['2014-01-10', '8', 'h']],
             '2015': [['2015-01-10', '5', 'e'], ['2015-01-10', '6', 'f']]})

Some of the resulting files for this input:

1929.csv

1929-01-10,1,a
1929-01-10,2,b
1929-01-10,4,d
1929-01-10,7,g

2015.csv

2015-01-10,5,e
2015-01-10,6,f
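
Because the question describes the file as huge, holding every row in a dictionary may use more memory than is available. One possible alternative, sketched here under assumptions, is to write each row to its year file as soon as it is read and keep only the open output files in memory. The function name `split_by_year` and its parameters are hypothetical, the file is assumed to have no header row, and the number of distinct years is assumed to be small enough to keep one file open per year.

```python
import csv
import os


def split_by_year(main_file, out_dir):
    """Stream main_file and write each row to <out_dir>/<year>.csv.

    Returns the sorted list of years that were found.
    """
    handles = {}  # year -> open output file
    writers = {}  # year -> csv.writer bound to that file
    try:
        with open(main_file, newline="") as src:
            for row in csv.reader(src):
                if not row:  # skip blank lines
                    continue
                year = row[0].split("-")[0]
                if year not in writers:
                    # First row for this year: create its output file.
                    path = os.path.join(out_dir, year + ".csv")
                    handles[year] = open(path, "w", newline="")
                    writers[year] = csv.writer(handles[year])
                writers[year].writerow(row)
    finally:
        # Close every output file, even if reading failed part-way.
        for fh in handles.values():
            fh.close()
    return sorted(handles)
```

A call such as split_by_year("main.csv", ".") would then create 1929.csv, 1930.csv, 2014.csv and 2015.csv for the sample input above, with the rows in each file kept in their original order.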
