
Python: IndexError: list index out of range (reading from CSV with 3 columns)

I am working on creating a stacked bar graph drawn from data in a CSV file. The data looks like this:

ANC-088,333,148
ANC-089,153,86
ANC-090,138,75

There are more rows just like this.

The starter script I have, just to begin playing with bar graphs, looks like this:

from pylab import *

name = []
totalwords = []
uniquewords = []

readFile = open('wordstats-legends.csv', 'r').read()
eachLine = readFile.split('\n')

for line in eachLine:
    split = line.split(',')
    name.append(split[0])
    totalwords.append(split[1])
    uniquewords.append(int(split[2]))

pos = arange(len(name)) + 0.5
bar(pos, totalwords, align = 'center', color='red')
xticks(pos, name)

When I ran it to see how things were going, I got the following error:

---> 13     totalwords.append(split[1])
IndexError: list index out of range

What am I not seeing, and what are my first steps in fixing this? (Additional explanations are most welcome as I continue to try to teach myself this stuff.)

Evidently this is a problem with your .csv: one or more of its lines does not contain the expected three fields. You can filter out those lines like this:

eachLine = [item for item in readFile.split('\n') if len(item.split(',')) >= 3]

In the full script:

from pylab import *

name = []
totalwords = []
uniquewords = []

readFile = open('wordstats-legends.csv', 'r').read()
eachLine = [item for item in readFile.split('\n') if len(item.split(',')) >= 3]

for line in eachLine:
    split = line.split(',')
    name.append(split[0])
    totalwords.append(int(split[1]))  # convert to int so bar() gets numeric heights
    uniquewords.append(int(split[2]))

pos = arange(len(name)) + 0.5
bar(pos, totalwords, align = 'center', color='red')
xticks(pos, name)

If you are sure the whole file looks like you described, the problem is most likely the final newline at the end of the file: splitting on the newline character leaves an empty string as the last element of eachLine, because nothing follows the last newline. So you only need to drop the last element of eachLine, e.g. with eachLine.pop() after splitting.
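To make the trailing-newline effect concrete, here is a minimal sketch. The string `text` is a stand-in for the file contents (an assumption; the real file is not shown in full), and `str.splitlines()` is offered as an alternative that was not part of the original suggestion.

```python
# Sample text standing in for the file contents (note the trailing newline).
text = "ANC-088,333,148\nANC-089,153,86\nANC-090,138,75\n"

# Splitting on '\n' leaves an empty string after the final newline.
lines = text.split('\n')
print(lines[-1] == '')

# Splitting that empty string on ',' gives a one-element list,
# so index 1 does not exist and split[1] raises IndexError.
print(''.split(','))

# Option 1: drop the trailing empty element, but only if it is really there.
if lines and lines[-1] == '':
    lines.pop()

# Option 2: str.splitlines() does not produce the trailing empty string.
assert lines == text.splitlines()
```

Checking that the last element is empty before calling pop() guards against files that do not end with a newline, where pop() would otherwise discard a real row.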

If you would like a robust and general solution that handles every line that can't be split into three parts, you should use the solution from user1823. However, if the problem really is only what I have described above, checking the condition by splitting every line can slow you down for larger files.
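As one possible take on the robust route, here is a sketch that uses the standard library's csv module instead of manual splitting. This is not user1823's code; the helper name `read_word_stats` is hypothetical, and the sketch assumes the second and third columns always hold integers.

```python
import csv

def read_word_stats(lines):
    """Parse 'name,total,unique' rows; skip rows with fewer than 3 fields.

    `lines` is any iterable of text lines, e.g. an open file object.
    """
    names, totals, uniques = [], [], []
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # blank or malformed line, e.g. the trailing empty one
        names.append(row[0])
        totals.append(int(row[1]))   # assumes numeric columns
        uniques.append(int(row[2]))
    return names, totals, uniques

# Usage with the file from the question:
# with open('wordstats-legends.csv', newline='') as f:
#     name, totalwords, uniquewords = read_word_stats(f)
```

Each line is split only once here, since the length check runs on the row that csv.reader has already parsed.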
