
String index out of range when sorting a txt file in Python

I have a txt file with two columns: the first is the car name and the second is the gallons used per hour. I'm trying to sort it in descending order by the second column with the code below:

import operator
car = open('Mileage.txt', 'r')
car_content = car.read()
sorted_content = sorted(car_content, key = operator.itemgetter(1), reverse=True)
print(car_content)
car.close()

I receive this error:

sorted_content = sorted(car_content, key = operator.itemgetter(1), reverse=True)
IndexError: string index out of range

If I change the key to

key = operator.itemgetter(0)

it runs, but it only prints the file without sorting it in descending order.

file link: https://drive.google.com/file/d/1HW7zhGKVTHYLs4SrdQ1XMrc3k01BA3nT/view?usp=sharing

How can I fix it?

Let's review how operator.itemgetter() works. Say you have a list of tuples like this:

list1 = [(1,2,3),
        (4,5,6)]

operator.itemgetter(0) returns a function that fetches the first value from whatever you pass it, here a tuple. You can apply that function to every item of a list with map or with a list comprehension:

import operator

# Using map
print(list(map(operator.itemgetter(0), list1)))
# Using a list comprehension
print([operator.itemgetter(1)(val) for val in list1])

The first one prints [1, 4]. The second one prints [2, 5].

Some suggestions on reading the file:

Use a context manager to open the file; it closes the file automatically after reading. Each line in the file ends with '\n' (a newline character), which you will probably want to strip off.

with open('Mileage.txt', 'r') as car:
    car_content = car.read().splitlines() 

When you read the file content like this, car_content is a list of strings:

['Prius,2.1', 'Camry,4.1', 'Sebring,4.2', 'Mustang,5.3 ', 'Accord,4.1', 'Camry,3.8', 'Camry,3.9', 'Mustang,5.2', 'Accord,4.3', 'Prius,2.3', 'Camry,4.2', 'Accord,4.4']

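operator.itemgetter(1) will not do what you want on the above list: each item is one single string such as 'Prius,2.1', so index 1 is just its second character, not the second column. In your original code it is worse, because car.read() returns the whole file as one string. sorted() then iterates over that string character by character, and each character is a one-character string with no index 1. That is why you get IndexError: string index out of range.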
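To see where the IndexError comes from, here is a small sketch. It uses a short in-memory string as a stand-in for the file contents (the sample text is an assumption, shaped like the list above):

```python
import operator

# Stand-in for what car.read() returns: the whole file as ONE string.
car_content = "Prius,2.1\nCamry,4.1\n"

# Iterating over a string yields single characters, so sorted() passes
# one-character strings to the key function.
first_items = list(car_content)[:3]
print(first_items)

get_second = operator.itemgetter(1)
try:
    get_second("P")  # a one-character string has no index 1
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)

# On a whole line, index 1 is merely the second character, not the column.
second_char = get_second("Prius,2.1")
print(second_char)
```

This also explains why itemgetter(0) runs without an error: every character has an index 0, so the characters themselves get sorted.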

Now, what you need to do is split each string in this list on ',':

car_content = [tuple(car.split(',')) for car in car_content]

This will give you the list of tuples -

[('Prius', '2.1'),
('Camry', '4.1'),
('Sebring', '4.2'),
('Mustang', '5.3 '),
('Accord', '4.1'),
('Camry', '3.8'),
('Camry', '3.9'),
('Mustang', '5.2'),
('Accord', '4.3'),
('Prius', '2.3'),
('Camry', '4.2'),
('Accord', '4.4')]

You can now use sorted() with either index 0 or index 1 as the key. Here's the complete code:

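import operator

# Read the file into a list of lines without trailing newlines
with open('Mileage.txt', 'r') as car:
    car_content = car.read().splitlines()
# Split each line into a (name, value) tuple
car_content = [tuple(line.split(',')) for line in car_content]
# Sort by the second column, largest first
sorted_content = sorted(car_content, key = operator.itemgetter(1), reverse=True)
print(sorted_content)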

With output -

[('Mustang', '5.3 '),
('Mustang', '5.2'),
('Accord', '4.4'),
('Accord', '4.3'),
('Sebring', '4.2'),
('Camry', '4.2'),
('Camry', '4.1'),
('Accord', '4.1'),
('Camry', '3.9'),
('Camry', '3.8'),
('Prius', '2.3'),
('Prius', '2.1')]
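One caveat: the key above compares the second column as text. That happens to give the right order for this data, because every value has a single digit before the decimal point, but text comparison would rank '10.5' below '9.8'. A minimal sketch of a numeric key, using made-up rows (not taken from the file) purely for illustration:

```python
# Hypothetical rows; 'Truck' and the value '10.5' are invented for the example.
rows = [("Prius", "2.1"), ("Truck", "10.5"), ("Mustang", "5.3 "), ("Camry", "9.8")]

# Text comparison: '10.5' sorts last in descending order because '1' < '2'.
by_text = sorted(rows, key=lambda row: row[1], reverse=True)

# Numeric comparison: float() also tolerates the trailing space in '5.3 '.
by_number = sorted(rows, key=lambda row: float(row[1]), reverse=True)

print([name for name, _ in by_text])
print([name for name, _ in by_number])
```

If the file may ever contain values of 10 or more, converting with float() in the key is probably the safer choice.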

You first need to organize your data into lines and columns; at the moment you are reading the whole file as one string. Your file is in CSV (Comma Separated Values) format, so you should read it line by line and then split each line at the comma:

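with open("Mileage.txt", "r") as f:
    data = f.readlines()

# Strip the trailing newline and split each line at the comma
data = [line.strip().split(",") for line in data]
# Convert the second column to a number so the sort is numeric
data = [(line[0], float(line[1])) for line in data]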

You can then sort the list of tuples:

data.sort(key=lambda item: item[1], reverse=True)

I recommend reading the docs for strip, split, open and readlines, and printing the data after each operation to understand the process.
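Since the file is comma-separated, the standard-library csv module can handle the splitting as well. A minimal sketch, assuming the same two-column layout; an in-memory io.StringIO stands in for the real file so the example runs without Mileage.txt:

```python
import csv
import io

# Stand-in for open("Mileage.txt", newline=""); the rows follow the
# name,value layout described in the question.
sample = io.StringIO("Prius,2.1\nCamry,4.1\nMustang,5.3 \nAccord,4.4\n")

# csv.reader yields each line as a list of column strings.
data = [(name, float(gallons)) for name, gallons in csv.reader(sample)]

# Sort by the numeric second column, largest first.
data.sort(key=lambda item: item[1], reverse=True)
print(data)
```

With the real file, replacing sample with the handle from a with open(...) block should be the only change needed, provided the file has no blank or malformed lines.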
