
String index out of range when sorting a txt file in Python

I have a txt file with 2 columns: the first column is the car name and the second is the gallons used per hour. I'm trying to sort it in descending order by the second column's value with the code below:

import operator
car = open('Mileage.txt', 'r')
car_content = car.read()
sorted_content = sorted(car_content, key = operator.itemgetter(1), reverse=True)
print(car_content)
car.close()

I receive this error on the line sorted_content = sorted(car_content, key = operator.itemgetter(1), reverse=True):

IndexError: string index out of range

If I change the key to

key = operator.itemgetter(0)

it runs, but it only prints the file contents without sorting them in descending order.

File link: https://drive.google.com/file/d/1HW7zhGKVTHYLs4SrdQ1XMrc3k01BA3nT/view?usp=sharing

How can I fix it?

Let's review how operator.itemgetter() works. Say you have a list of tuples like this:

list1 = [(1,2,3),
        (4,5,6)]

Selecting operator.itemgetter(0) means I want the first value from each tuple. The function can be applied to every element of a list with map or a list comprehension:

# using map
print(list(map(operator.itemgetter(0), list1)))
# using a list comprehension
print([operator.itemgetter(1)(val) for val in list1])

The first one prints [1, 4] and the second one prints [2, 5].
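As a rough illustration of what itemgetter does under the hood, the sketch below compares it with a plain lambda that indexes its argument. The names second, via_itemgetter and via_lambda are only for this example.

```python
import operator

list1 = [(1, 2, 3),
         (4, 5, 6)]

# itemgetter(1) builds a callable that returns item[1] of whatever it is given
second = operator.itemgetter(1)

via_itemgetter = [second(val) for val in list1]
via_lambda = [(lambda item: item[1])(val) for val in list1]

print(via_itemgetter)  # [2, 5]
print(via_lambda)      # [2, 5]
```

Either form can be passed as the key argument to sorted.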

Some suggestions on reading the file:

Use a context manager to open the file; it will automatically close the file when the block ends. The lines in the file end with '\n' (a newline character), which you will probably want to strip off.

with open('Mileage.txt', 'r') as car:
    car_content = car.read().splitlines() 

When you read the file like this, car_content will be a list of strings:

['Prius,2.1', 'Camry,4.1', 'Sebring,4.2', 'Mustang,5.3 ', 'Accord,4.1', 'Camry,3.8', 'Camry,3.9', 'Mustang,5.2', 'Accord,4.3', 'Prius,2.3', 'Camry,4.2', 'Accord,4.4']

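operator.itemgetter(1) will not do what you want on the above list: each item is one single string with the two values separated by ',', so index 1 is just its second character, not the second column. Your original code is one step further away: car.read() returns the whole file as one string, sorted() iterates over it character by character, and a one-character string has no index 1. That's why you're getting "IndexError: string index out of range".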
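To see where the error in the question comes from, here is a small sketch. It assumes a short in-memory string in place of the real file contents; the variable names are illustrative only.

```python
import operator

# Hypothetical stand-in for what car.read() returns: one long string
car_content = "Prius,2.1\nCamry,4.1\n"

# Iterating a string yields single characters, which is what sorted() sees
first_items = list(car_content)[:3]
print(first_items)  # ['P', 'r', 'i']

get_second = operator.itemgetter(1)
try:
    get_second(first_items[0])  # 'P'[1] does not exist
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)  # string index out of range

# On a whole line, index 1 is just the second character, not the second column
second_char = get_second("Prius,2.1")
print(second_char)  # r
```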

Now, what you need to do is split each string in this list on ',':

car_content = [tuple(car.split(',')) for car in car_content]

This will give you a list of tuples:

[('Prius', '2.1'),
('Camry', '4.1'),
('Sebring', '4.2'),
('Mustang', '5.3 '),
('Accord', '4.1'),
('Camry', '3.8'),
('Camry', '3.9'),
('Mustang', '5.2'),
('Accord', '4.3'),
('Prius', '2.3'),
('Camry', '4.2'),
('Accord', '4.4')]

You can now use the sorted function with either index 0 or 1. Here's the complete code:

import operator
with open('Mileage.txt', 'r') as car:
    car_content = car.read().splitlines()
car_content = [tuple(car.split(',')) for car in car_content]
sorted_content = sorted(car_content, key = operator.itemgetter(1), reverse=True)
print(sorted_content)

With output:

[('Mustang', '5.3 '),
('Mustang', '5.2'),
('Accord', '4.4'),
('Accord', '4.3'),
('Sebring', '4.2'),
('Camry', '4.2'),
('Camry', '4.1'),
('Accord', '4.1'),
('Camry', '3.9'),
('Camry', '3.8'),
('Prius', '2.3'),
('Prius', '2.1')]
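One caveat: the second column is still a string here, so sorted compares the values as text. That gives the right order for this file because every value has the form digit.digit, but it would misplace a value such as '10.5'. A minimal sketch of sorting numerically instead, using made-up rows (the 'Truck' entry is hypothetical and not in the original file):

```python
# Hypothetical rows, including a two-digit value that the original file does not contain
rows = [('Prius', '2.1'), ('Truck', '10.5'), ('Mustang', '5.3 ')]

# String comparison: '5.3 ' > '2.1' > '10.5', because '1' < '2' < '5'
as_text = sorted(rows, key=lambda row: row[1], reverse=True)

# Numeric comparison: float() also tolerates the trailing space in '5.3 '
as_number = sorted(rows, key=lambda row: float(row[1]), reverse=True)

print(as_text)    # [('Mustang', '5.3 '), ('Prius', '2.1'), ('Truck', '10.5')]
print(as_number)  # [('Truck', '10.5'), ('Mustang', '5.3 '), ('Prius', '2.1')]
```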

You first need to structure your data into rows and columns; right now you are reading the whole file as one string. Your file is in CSV (comma-separated values) format, so you should read it line by line and then split each line at the comma:

with open("Mileage.txt", "r") as f:
    data = f.readlines()

data = [line.strip().split(",") for line in data]
data = [(line[0], float(line[1])) for line in data]

You can then sort the list of tuples:

data.sort(key=lambda item: item[1], reverse=True)
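Since the data is comma-separated, the standard library's csv module could do the splitting as well. This is an alternative sketch, not the approach used in the answer above, and it assumes an in-memory sample (via io.StringIO) in place of the real file:

```python
import csv
import io

# Hypothetical in-memory sample standing in for Mileage.txt
sample = io.StringIO("Prius,2.1\nCamry,4.1\nMustang,5.3 \n\nAccord,4.1\n")

# csv.reader splits each line on commas; blank lines come back as empty lists, so skip them
data = [(row[0], float(row[1])) for row in csv.reader(sample) if row]

data.sort(key=lambda item: item[1], reverse=True)
print(data)
# [('Mustang', 5.3), ('Camry', 4.1), ('Accord', 4.1), ('Prius', 2.1)]
```

With a real file, the sample object would be replaced by something like open("Mileage.txt", newline="") inside a with block.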

I recommend reading the documentation for strip, split, open, and readlines, and printing the data between each operation to understand the process.

