
Why am I getting an IndexError in Python 3 when indexing a string and not slicing?

I'm new to programming, and experimenting with Python 3. I've found a few topics which deal with IndexError but none that seem to help with this specific circumstance.

I've written a function which opens a text file, reads it one line at a time, and slices the line up into individual strings which are each appended to a particular list (one list per 'column' in the record line). Most of the slices are multiple characters [x:y] but some are single characters [x].

I'm getting an IndexError: string index out of range message when, as far as I can tell, the index isn't out of range. This is the function:

def read_recipe_file():
    recipe_id = []
    recipe_book = []
    recipe_name = []
    recipe_page = []
    ingred_1 = []
    ingred_1_qty = []
    ingred_2 = []
    ingred_2_qty = []
    ingred_3 = []
    ingred_3_qty = []

    f = open('recipe-file.txt', 'r')  # open the file 
    for line in f:
        # slice out each component of the record line and store it in the appropriate list
        recipe_id.append(line[0:3])
        recipe_name.append(line[3:23])
        recipe_book.append(line[23:43])
        recipe_page.append(line[43:46])
        ingred_1.append(line[46]) 
        ingred_1_qty.append(line[47:50])
        ingred_2.append(line[50]) 
        ingred_2_qty.append(line[51:54])
        ingred_3.append(line[54]) 
        ingred_3_qty.append(line[55:])
    f.close()
    return recipe_id, recipe_name, recipe_book, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, ingred_3, \
           ingred_3_qty

This is the traceback:

Traceback (most recent call last):
  File "recipe-test.py", line 84, in <module>
    recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, ingred_3, ingred_3_qty = read_recipe_file()
  File "recipe-test.py", line 27, in read_recipe_file
    ingred_1.append(line[46])

The code which calls the function in question is:

print('To show list of recipes: 1')
print('To add a recipe: 2')
user_choice = input()
recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, \
ingred_3, ingred_3_qty = read_recipe_file()

if int(user_choice) == 1:
    print_recipe_table(recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty,
                    ingred_2, ingred_2_qty, ingred_3, ingred_3_qty)

elif int(user_choice) == 2:
    # code to add recipe
    pass

The failing line is this:

ingred_1.append(line[46])

There are more than 46 characters in each line of the text file I am trying to read, so I don't understand why I'm getting an out of bounds error (a sample line is below). If I change the code to this:

ingred_1.append(line[46:])

to read a slice, rather than a specific character, the line executes correctly, and the program fails on this line instead:

ingred_2.append(line[50])

This leads me to think it is somehow related to appending a single character from the string, rather than a slice of multiple characters.

Here is a sample line from the text file I am reading:

001Cheese on Toast     Meals For Two       012120038005002

I should probably add that I'm well aware this isn't great code overall - there are lots of ways I could generally improve the program, but as far as I can tell the code should actually work.

This will happen if some of the lines in the file are empty or at least short. A stray newline at the end of the file is a common cause, since that comes up as an extra blank line. The best way to debug a case like this is to catch the exception, and investigate the particular line that fails (which almost certainly won't be the sample line you reproduced):

try:
    ingred_1.append(line[46])
except IndexError:
    print(line)
    print(len(line))

Catching this exception is also usually the right way to deal with the error: you've detected a pathological case, and now you can consider what to do. You might for example:

  • continue, which will silently skip processing that line,
  • Log something and then continue,
  • Bail out by raising a new, more topical exception: e.g. raise ValueError("Line too short").

Printing something relevant, with or without continuing, is almost always a good idea if this represents a problem with the input file that warrants fixing. Continuing silently is a good option if it is something relatively trivial that you know can't cause flow-on errors in the rest of your processing. You may want to differentiate between the "too short" and "completely empty" cases by detecting the "completely empty" case early, such as by doing this at the top of your loop:

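# Lines read from a file keep their trailing newline, so a blank line
# arrives as '\n' rather than ''. Strip before testing for emptiness.
if not line.strip():
    # Skip blank lines
    continue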

Then handle the error for the other case appropriately.
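The options above can be combined into one helper. This is only a sketch: `parse_line`, `MIN_LENGTH` and the `line_number` argument are hypothetical names, and the field offsets are copied from the function in the question. It skips blank lines and raises a ValueError for any line too short to contain index 54.

```python
# Hypothetical constant: line[54] is the highest single-character index
# used in the question, so a usable record needs at least 55 characters.
MIN_LENGTH = 55


def parse_line(line, line_number):
    """Return the ten fields of one record, or None for a blank line."""
    record = line.rstrip("\n")  # drop the newline kept by file iteration
    if not record.strip():
        return None  # blank line, e.g. a stray newline at the end of the file
    if len(record) < MIN_LENGTH:
        raise ValueError(
            f"Line {line_number} too short ({len(record)} characters): {record!r}"
        )
    return (
        record[0:3], record[3:23], record[23:43], record[43:46],
        record[46], record[47:50], record[50], record[51:54],
        record[54], record[55:],
    )
```

In the reading loop you would call this as `parse_line(line, n)` with `n` coming from `enumerate(f, start=1)`, skip any `None` result, and append the fields to the lists otherwise.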


The reason changing it to a slice works is that string slices never raise an IndexError. If both indexes in the slice are outside the string (in the same direction), you will get an empty string - e.g.:

>>> 'abc'[4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> 'abc'[4:]
''
>>> 'abc'[4:7]
''
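If you would rather have a single-character lookup that never raises, a one-character slice behaves the same way as the slices above. The helper name `char_at` and its `default` parameter are made up for illustration, and the sketch assumes a non-negative index.

```python
def char_at(text, index, default=""):
    """Return text[index], or default when index is past the end of text.

    Assumes index >= 0; negative indexes are not handled by this sketch.
    """
    # A one-character slice yields '' instead of raising IndexError.
    return text[index:index + 1] or default
```

Note that this hides short lines instead of reporting them, so it only suits cases where a missing field is acceptable.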

Your code fails on line[46] because line contains fewer than 47 characters. The slice operation line[46:] still works because an out-of-range string slice returns an empty string.

You can verify that the line is too short by replacing

ingred_1.append(line[46])

with

try:
    ingred_1.append(line[46])
except IndexError:
    print('line = "%s", length = %d' % (line, len(line)))
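To check the whole file in one pass rather than stopping at the first failure, a scan along these lines may help. `find_short_lines` and its `min_length` default of 55 are assumptions based on the highest single-character index (54) in the question's code.

```python
def find_short_lines(path, min_length=55):
    """Return (line_number, length, text) for every line shorter than min_length."""
    problems = []
    with open(path, "r") as f:
        for line_number, line in enumerate(f, start=1):
            record = line.rstrip("\n")  # measure the line without its newline
            if len(record) < min_length:
                problems.append((line_number, len(record), record))
    return problems
```

Calling `find_short_lines('recipe-file.txt')` would list every offending line with its position, which makes a stray blank line at the end of the file easy to spot.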
