
Best way to sort txt file using csv tools in python

I have the following code and am trying to sort the file contents in the simplest way possible.

import csv
import operator

#==========Search by ID number. Return Just the Name Fields for the Student
with open("studentinfo.txt","r") as f:
  studentfileReader=csv.reader(f)
  id=input("Enter Id:")
  for row in studentfileReader:
    for field in row:
      if field==id:
        currentindex=row.index(id)
        print(row[currentindex+1]+" "+row[currentindex+2])

#=========Sort by Last Name
with open("studentinfo.txt","r") as f:
  studentfileReader=csv.reader(f)
  sortedlist=sorted(f,key=operator.itemgetter(0),reverse=True)
  print(sortedlist)

I am aware of various possible solutions but cannot quite get them to work. For teaching/learning purposes, I would also be interested in the simplest effective solution, with a clear explanation.

Research included using operator.itemgetter:

import operator
sortedlist = sorted(reader, key=operator.itemgetter(3), reverse=True)

or using a lambda:

sortedlist = sorted(reader, key=lambda row: row[3], reverse=True)

For the ANSWER, I would be grateful if someone could post a full working solution showing sorting by LAST NAME and by ID number, to illustrate two different examples. An extension to the answer would be to show how to sort by multiple values in this specific example.

Full code listing:

https://repl.it/Jau3/3

Contents of File

002,Ash,Smith,Test1:20,Test2:20,Test3:100003
004,Grace,Asha,Test1:33,Test2:54,Test3:23
005,Cat,Zelch,Test1:66,Test2:22,Test3:11
001,Joe,Bloggs,Test1:99,Test2:100,Test3:1
003,Jonathan,Peter,Test1:99,Test2:33,Test3:44

You can use a lambda function to sort by any column of the rows returned by csv.reader, for example, by last name (third column):

with open("studentinfo.txt", "r") as f:
    reader = csv.reader(f)
    sorted_list = list(reader)  # turn the reader iterator into a list
    sorted_list.sort(key=lambda x: x[2])  # use the third column as a sorting key
    print("\n".join(str(row) for row in sorted_list))  # prettier print

or by ID (first column):

with open("studentinfo.txt", "r") as f:
    reader = csv.reader(f)
    sorted_list = list(reader)  # turn the reader iterator into a list
    sorted_list.sort(key=lambda x: x[0])  # the first column as a sorting key, can be omitted
    print("\n".join(str(row) for row in sorted_list))  # prettier print

Or by two keys:

with open("studentinfo.txt", "r") as f:
    reader = csv.reader(f)
    sorted_list = list(reader)  # turn the reader iterator into a list
    sorted_list.sort(key=lambda x: (x[3], x[4]))  # use fourth and fifth column
    print("\n".join(str(row) for row in sorted_list))  # prettier print

You can add reverse=True to the list.sort() calls for a descending sort.
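
One caveat worth illustrating: the fourth and fifth columns hold strings such as Test1:99, so the two-key sort above compares text, not numbers ("Test2:100" sorts before "Test2:33"). The following is a minimal sketch of a numeric, descending sort. It assumes every score field has the form label:number; score is a hypothetical helper, and io.StringIO stands in for studentinfo.txt so the example is self-contained.

```python
import csv
import io

# Sample rows copied from the question; io.StringIO stands in for studentinfo.txt.
data = """002,Ash,Smith,Test1:20,Test2:20,Test3:100003
004,Grace,Asha,Test1:33,Test2:54,Test3:23
005,Cat,Zelch,Test1:66,Test2:22,Test3:11
001,Joe,Bloggs,Test1:99,Test2:100,Test3:1
003,Jonathan,Peter,Test1:99,Test2:33,Test3:44
"""


def score(field):
    # Hypothetical helper: "Test1:20" -> 20 (assumes the label:number layout)
    return int(field.split(":")[1])


rows = list(csv.reader(io.StringIO(data)))
# Highest Test1 first; ties are broken by Test2, also highest first.
rows.sort(key=lambda x: (score(x[3]), score(x[4])), reverse=True)
print("\n".join(str(row) for row in rows))
```

With this key, Joe (Test2:100) is listed before Jonathan (Test2:33). A plain string comparison would put them the other way round.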

ADDENDUM - If you really don't want to use lambdas (why?), you can define an item-getter function (or just use operator.itemgetter, which exists for this purpose) and pass that to the list.sort() call, e.g.:

def get_third_column(x):
    return x[2]

with open("studentinfo.txt", "r") as f:
    reader = csv.reader(f)
    sorted_list = list(reader)  # turn the reader iterator into a list
    sorted_list.sort(key=get_third_column)  # use the third column as a sorting key
    print("\n".join(str(row) for row in sorted_list))  # prettier print
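
For comparison, here is a sketch of the same sort using operator.itemgetter in place of the hand-written function. The sample rows are copied from the question, and io.StringIO is used only to keep the example self-contained; with the real file you would pass the open file handle to csv.reader instead.

```python
import csv
import io
import operator

# Sample rows copied from the question; io.StringIO stands in for studentinfo.txt.
data = """002,Ash,Smith,Test1:20,Test2:20,Test3:100003
004,Grace,Asha,Test1:33,Test2:54,Test3:23
005,Cat,Zelch,Test1:66,Test2:22,Test3:11
001,Joe,Bloggs,Test1:99,Test2:100,Test3:1
003,Jonathan,Peter,Test1:99,Test2:33,Test3:44
"""

sorted_list = list(csv.reader(io.StringIO(data)))
# itemgetter(2) builds a callable equivalent to get_third_column / lambda x: x[2]
sorted_list.sort(key=operator.itemgetter(2))
print("\n".join(str(row) for row in sorted_list))
```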

A compact, simple solution to read -> sort -> write:

import csv
import operator

with open("input.csv") as fh:
    reader = csv.reader(fh)
    rows = sorted(reader, key=operator.itemgetter(0), reverse=True)

with open("output.csv", "w", newline="") as fh:  # newline="" avoids blank lines between rows on Windows
    csv.writer(fh).writerows(rows)

To print on the console instead of writing to a file, you can use sys.stdout as the file handle:

import sys

# Pass sys.stdout directly; wrapping it in "with" would close it afterwards
csv.writer(sys.stdout).writerows(rows)

The argument to operator.itemgetter determines which field to sort by. Field 0 is the id. To sort by last name, use operator.itemgetter(2), because the last name is the 3rd column.

To sort by multiple fields, you can use a lambda that returns a tuple (operator.itemgetter(2, 1) would work as well), for example to sort by last and then first name:

    rows = sorted(reader, key=lambda x: (x[2], x[1]), reverse=True)
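
As a side note, operator.itemgetter also accepts several indices and then returns a tuple, so it can serve as a multi-field key too. The sketch below, using the sample rows from the question (io.StringIO stands in for the file), checks that both keys give the same ordering.

```python
import csv
import io
import operator

# Sample rows copied from the question; io.StringIO stands in for the input file.
data = """002,Ash,Smith,Test1:20,Test2:20,Test3:100003
004,Grace,Asha,Test1:33,Test2:54,Test3:23
005,Cat,Zelch,Test1:66,Test2:22,Test3:11
001,Joe,Bloggs,Test1:99,Test2:100,Test3:1
003,Jonathan,Peter,Test1:99,Test2:33,Test3:44
"""

rows = list(csv.reader(io.StringIO(data)))

# Key is the tuple (last name, first name) in both cases.
by_lambda = sorted(rows, key=lambda x: (x[2], x[1]), reverse=True)
by_itemgetter = sorted(rows, key=operator.itemgetter(2, 1), reverse=True)

print(by_lambda == by_itemgetter)
for row in by_itemgetter:
    print(row[2], row[1])
```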

The code before the sorting, where you ask user input for an Id, could be improved too:

  • Iterating over each field is unnecessary when you know that the id field is the first
  • id shadows a built-in function in Python, so it's not recommended as a variable name

You could write it like this:

with open("studentinfo.txt") as fh:
    reader = csv.reader(fh)
    student_id = input("Enter Id:")
    for row in reader:
        if row[0] == student_id:
            print(row[1] + " " + row[2])

Use import operator, as you have done. Here is one possible solution. Note that you would ideally need a header line to identify the column to sort by (for example, if the user wants to specify it explicitly):

import csv
import operator

# newline='' is what the csv module expects for files it reads or writes
with open('myfile.csv', newline='') as ifile:
    infile = csv.reader(ifile)
    # Note that if you have a header, this is the header line
    infields = next(infile)
    startindex = infields.index('Desired Header')
    # Here you are creating the sorted list
    sortedlist = sorted(infile, key=operator.itemgetter(startindex), reverse=True)

# open the output file - it can be the same as the input file
with open('myoutput.csv', 'w', newline='') as ofile:
    outfile = csv.writer(ofile)
    outfile.writerow(infields)
    outfile.writerows(sortedlist)
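
If the file does have a header line, csv.DictReader lets you sort by column name without looking up the index first. This is only a sketch: the header names id, first and last are assumptions (the file in the question has no header), and io.StringIO is used in place of real files.

```python
import csv
import io

# Hypothetical contents: a few rows from the question with an assumed header line.
data = """id,first,last
002,Ash,Smith
004,Grace,Asha
001,Joe,Bloggs
"""

reader = csv.DictReader(io.StringIO(data))
# Each row is a dict keyed by header name, so the sort key can use the name.
rows = sorted(reader, key=lambda row: row["last"])

out = io.StringIO()  # stands in for the output file
writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue(), end="")
```

To let the user choose the sort column, the name "last" could be replaced by a string read with input(), after checking that it appears in reader.fieldnames.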

