
How to sort data in a CSV file by a particular column using Python

I'm reading data from a CSV file and trying to sort it by a particular column. For example, I read the records of 100 students from a CSV file and need to sort them by marks.

import csv
import operator

with open('Student_Records.csv', 'r') as csvFile:
    reader = csv.reader(csvFile)
    for row in reader:
        print(row)
sortedlist = sorted(reader, key=operator.itemgetter(7), reverse=True)

for eachline in sortedlist:
    print(eachline)

csvFile.close()

The CSV file was created from an Excel sheet and has no column names. The following is the CSV file data:

1,Lois,Walker,F,lois.walker@hotmail.com,Donald Walker,Helen Walker,40,303-572-8492
2,Brenda,Robinson,F,brenda.robinson@gmail.com,Raymond Robinson,Judy Robinson,80,225-945-4954
3,Joe,Robinson,M,joe.robinson@gmail.com,Scott Robinson,Stephanie Robinson,70,219-904-2161
4,Diane,Evans,F,diane.evans@yahoo.com,Jason Evans,Michelle Evans,90,215-793-6791
5,Benjamin,Russell,M,benjamin.russell@charter.net,Gregory Russell,Elizabeth Russell,56,262-404-2252
6,Patrick,Bailey,M,patrick.bailey@aol.com,Ralph Bailey,Laura Bailey,36,319-812-6957
7,Nancy,Baker,F,nancy.baker@bp.com,Scott Baker,Judy Baker,78,229-336-5117

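The code below should work for you. After reading the CSV, I build a list of rows in which the marks are integers, rather than the strings that the csv module returns.

I am also assuming that the fields in the CSV are separated by multiple whitespace characters, so I use a space delimiter with skipinitialspace=True. Because the two-word parent names are then split into separate fields, the marks end up at index 9 for itemgetter. The index may differ depending on what your CSV looks like (for the comma-separated data in the question it would be 7).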

import csv
import operator

li = []

#Open csv file
with open('file.csv', 'r') as csvFile:
    reader = csv.reader(csvFile, delimiter=' ', skipinitialspace=True )

    #Create a list of all rows such that the marks column is an integer
    for item in reader:
        #Save marks value as an integer, leave other values as is
        l = [int(value) if idx == 9 else value for idx, value in enumerate(item)]
        li.append(l)

#Sort on that item
print(sorted(li, key=operator.itemgetter(9), reverse=True))

My csv looks like:

1   Lois    Walker  F   lois.walker@hotmail.com Donald Walker   Helen Walker    40  303-572-8492
2   Brenda  Robinson    F   brenda.robinson@gmail.com   Raymond Robinson    Judy Robinson   80  225-945-4954
3   Joe Robinson    M   joe.robinson@gmail.com  Scott Robinson  Stephanie Robinson  70  219-904-2161
4   Diane   Evans   F   diane.evans@yahoo.com   Jason Evans Michelle Evans  90  215-793-6791
5   Benjamin    Russell M   benjamin.russell@charter.net    Gregory Russell Elizabeth Russell   56  262-404-2252

The output will look like

[['4', 'Diane', 'Evans', 'F', 'diane.evans@yahoo.com', 'Jason', 'Evans', 'Michelle', 'Evans', 90, '215-793-6791'], 
['2', 'Brenda', 'Robinson', 'F', 'brenda.robinson@gmail.com', 'Raymond', 'Robinson', 'Judy', 'Robinson', 80, '225-945-4954'], 
['3', 'Joe', 'Robinson', 'M', 'joe.robinson@gmail.com', 'Scott', 'Robinson', 'Stephanie', 'Robinson', 70, '219-904-2161'], 
['5', 'Benjamin', 'Russell', 'M', 'benjamin.russell@charter.net', 'Gregory', 'Russell', 'Elizabeth', 'Russell', 56, '262-404-2252'], 
['1', 'Lois', 'Walker', 'F', 'lois.walker@hotmail.com', 'Donald', 'Walker', 'Helen', 'Walker', 40, '303-572-8492']]
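
If the file is comma-separated as in the question, the same idea can be sketched with the default delimiter, where the marks sit at index 7. This is only a minimal sketch under assumptions: the SAMPLE text, the MARKS_COL constant and the sort_by_marks helper are illustrative names, and io.StringIO stands in for the real file. Note that the rows are collected into a list before anything else consumes the reader.

```python
import csv
import io
from operator import itemgetter

# Sample rows copied from the question (comma-separated, no header row).
SAMPLE = """\
1,Lois,Walker,F,lois.walker@hotmail.com,Donald Walker,Helen Walker,40,303-572-8492
2,Brenda,Robinson,F,brenda.robinson@gmail.com,Raymond Robinson,Judy Robinson,80,225-945-4954
3,Joe,Robinson,M,joe.robinson@gmail.com,Scott Robinson,Stephanie Robinson,70,219-904-2161
4,Diane,Evans,F,diane.evans@yahoo.com,Jason Evans,Michelle Evans,90,215-793-6791
5,Benjamin,Russell,M,benjamin.russell@charter.net,Gregory Russell,Elizabeth Russell,56,262-404-2252
"""

# Assumption: with a comma delimiter the marks are the 8th field (index 7).
MARKS_COL = 7


def sort_by_marks(text_file, column=MARKS_COL):
    """Read all rows, convert the marks to int, and sort in descending order."""
    rows = list(csv.reader(text_file))  # consume the reader exactly once
    for row in rows:
        row[column] = int(row[column])  # numeric value, so the sort is numeric
    return sorted(rows, key=itemgetter(column), reverse=True)


# io.StringIO is a stand-in for open('Student_Records.csv', newline='')
sorted_rows = sort_by_marks(io.StringIO(SAMPLE))
for row in sorted_rows:
    print(row)
```

With a real file, the call would go inside the with block, for example sort_by_marks(csvFile), so that the rows are read while the file is still open.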

Try pandas:

import pandas as pd

df = pd.read_csv("your_file", sep='xx',
                 names=["x", "y", "z", "marks"])

# sort_values returns a new DataFrame, so assign the result back
df = df.sort_values('marks')

print(df)

You could try

import csv
with open('input.csv', newline='') as csvfile:
    rdr = csv.reader(csvfile)
    # Convert the marks field to int so the sort is numeric, not alphabetical
    l = sorted(rdr, key=lambda x: int(x[6]), reverse=True)

csv.reader() creates a reader object, which sorted() consumes to produce a list. Passing reverse=True sorts in descending order.
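
One caveat is worth illustrating here: csv.reader yields every field as a string, so unless the key converts the value to a number, sorted() compares text character by character. The short sketch below uses made-up marks (not taken from the question) to show the difference.

```python
# Hypothetical marks as csv.reader would return them: all strings.
marks = ['100', '90', '8']

# Comparing strings is lexicographic, so '100' sorts last in descending order.
as_text = sorted(marks, reverse=True)

# Converting in the key function gives the expected numeric order.
as_numbers = sorted(marks, key=int, reverse=True)

print(as_text)     # ['90', '8', '100']
print(as_numbers)  # ['100', '90', '8']
```

For marks that all have the same number of digits the two orderings happen to agree, which is why the problem can go unnoticed on small samples.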

This list can be used to write out an output csv using something like

# newline='' prevents an extra \r on platforms that use \r\n line endings
with open('output.csv', 'w', newline='') as csvout:
    wrtr = csv.writer(csvout)
    wrtr.writerows(l)

The output csv file would be something like

4,Diane   Evans,F,diane.evans@yahoo.com,Jason Evans,Michelle Evans,90,215-793-6791
2,Brenda  Robinson,F,brenda.robinson@gmail.com,Raymond Robinson,Judy Robinson,80,225-945-4954
3,Joe Robinson,M,joe.robinson@gmail.com,Scott Robinson,Stephanie Robinson,70,219-904-2161
5,Benjamin    Russell,M,benjamin.russell@charter.net,Gregory Russell,Elizabeth Russell,56,262-404-2252
1,Lois  Walker,F,lois.walker@hotmail.com,Donald Walker,Helen Walker,40,303-572-8492

Since you are reading the data from a file object, specify the newline parameter as '' to be safe.

As the docs say:

If csvfile is a file object, it should be opened with newline=''.

The docs also explain why:

If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n linendings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
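
As a rough illustration of the behaviour described in that quote, the sketch below writes and re-reads a small file with newline='' on both sides. The rows and the temporary path are made up for the example. One field contains an embedded newline to show that it survives the round trip.

```python
import csv
import os
import tempfile

# Made-up rows; the third field of the first row contains an embedded newline.
rows = [
    ["1", "Lois Walker", "line one\nline two", "40"],
    ["2", "Brenda Robinson", "single line", "80"],
]

# Temporary location so the example does not touch any real file.
path = os.path.join(tempfile.mkdtemp(), "output.csv")

# Write with newline='' so the csv module controls the line endings itself.
with open(path, "w", newline="") as csvout:
    csv.writer(csvout).writerows(rows)

# Read back with newline='' so the quoted, embedded newline is kept intact.
with open(path, newline="") as csvin:
    read_back = list(csv.reader(csvin))

# Inspect the raw bytes: no doubled carriage return before the line feed.
with open(path, "rb") as raw:
    raw_bytes = raw.read()

print(read_back == rows)
print(b"\r\r\n" in raw_bytes)
```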
