
Python sort CSV File

Hey, I have a CSV file with many rows, but the value in the first column repeats across consecutive rows. Is it possible to keep that name only once per group while keeping all the other data? I tried pandas, but pandas asks for an aggregate function such as sum. My data in the CSV file looks like this:

H1 h2 h3 h4
A 1 2 3 4
A 2 3 4 5
A 3 4 5 6
B 1 2 3 4
B 2 3 4 5
B 3 4 5 6
C 1 2 3 4
C 2 3 4 5
C 3 4 5 6

Each column has a header, shown here as h1-h4. My real data does not look like this; it contains actual text values.

I want to rearrange the data so it looks like this.

A 
   1 2 3 4
   2 3 4 5
   3 4 5 6
B
   1 2 3 4
   2 3 4 5
   3 4 5 6

C
   1 2 3 4
   2 3 4 5
   3 4 5 6

Or

A  1 2 3 4
   2 3 4 5
   3 4 5 6

B  1 2 3 4
   2 3 4 5
   3 4 5 6

C  1 2 3 4
   2 3 4 5
   3 4 5 6

So basically I want to group the rows by the first column, h1. Any help would be appreciated, thanks.

The following should work. It assumes your source data is space-delimited (as you have shown); if it uses commas or tabs, you will need to change the delimiter passed to csv.reader.

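import csv

# Python 2: the csv module expects the output file opened in binary mode ("wb").
with open("input.csv", "r") as f_input, open("output.csv", "wb") as f_output:
    csv_input = csv.reader(f_input, delimiter=" ")
    csv_output = csv.writer(f_output)
    headers = next(csv_input)  # read the header row so it is not treated as data

    cur_row = ""  # first-column value of the group currently being written
    for cols in csv_input:
        if cur_row != cols[0]:
            # A new group starts: write its name on a line of its own.
            cur_row = cols[0]
            csv_output.writerow([cur_row])
        # Write the remaining columns without the group name.
        csv_output.writerow(cols[1:])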

This gives you an output CSV file as follows:

A
1,2,3,4
2,3,4,5
3,4,5,6
B
1,2,3,4
2,3,4,5
3,4,5,6
C
1,2,3,4
2,3,4,5
3,4,5,6

Tested using Python 2.7. On Python 3, open the output file with "w" and newline="" instead of "wb".
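For Python 3, a minimal sketch of the same grouping logic might look like the following. The function name group_rows and its delimiter parameter are assumptions added for illustration, not part of the original answer; the variable names inside are kept the same as above. It also skips blank lines, which the original loop does not handle.

```python
import csv

def group_rows(in_path, out_path, delimiter=" "):
    """Write each group name once, followed by that group's remaining columns.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # In Python 3 the csv module wants text-mode files opened with newline="".
    with open(in_path, newline="") as f_input, \
         open(out_path, "w", newline="") as f_output:
        csv_input = csv.reader(f_input, delimiter=delimiter)
        csv_output = csv.writer(f_output)
        headers = next(csv_input)  # consume the header row

        cur_row = None  # first-column value of the current group
        for cols in csv_input:
            if not cols:  # skip blank lines instead of failing on cols[0]
                continue
            if cur_row != cols[0]:
                cur_row = cols[0]
                csv_output.writerow([cur_row])
            csv_output.writerow(cols[1:])
```

Calling group_rows("input.csv", "output.csv") should behave like the code above for the sample data; pass delimiter="," or delimiter="\t" for comma- or tab-separated input.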

To add the headers for each group, change the first writerow line as follows:

csv_output.writerows([[cur_row], headers])

This gives the following output:

A
H1,h2,h3,h4
1,2,3,4
2,3,4,5
3,4,5,6
B
H1,h2,h3,h4
1,2,3,4
2,3,4,5
3,4,5,6
C
H1,h2,h3,h4
1,2,3,4
2,3,4,5
3,4,5,6
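The question also shows a second layout, where the group name stays on the first row of each group and is left blank on the rows after it. The answer does not cover that case; one possible sketch, assuming Python 3 and using itertools.groupby, is shown here. The function names blank_repeated_names and write_compact are hypothetical, and the header row is simply skipped.

```python
import csv
from itertools import groupby
from operator import itemgetter

def blank_repeated_names(rows):
    """Yield rows with the first column blanked on all but the first row of each run."""
    # groupby only merges *consecutive* rows that share the same first column.
    for name, group in groupby(rows, key=itemgetter(0)):
        for i, cols in enumerate(group):
            yield [name if i == 0 else ""] + cols[1:]

def write_compact(in_path, out_path, delimiter=" "):
    """Hypothetical wrapper: read in_path, write the compact layout to out_path."""
    with open(in_path, newline="") as f_input, \
         open(out_path, "w", newline="") as f_output:
        reader = csv.reader(f_input, delimiter=delimiter)
        writer = csv.writer(f_output)
        next(reader, None)  # skip the header row
        rows = (cols for cols in reader if cols)  # drop blank lines
        writer.writerows(blank_repeated_names(rows))
```

Like the loop in the answer, this relies on rows of the same group being adjacent in the file; if they are not, the rows would need to be sorted by the first column beforehand.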
