Why do I get an IndexError in Python 3 when indexing a string but not when slicing it?

Question

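I'm new to programming and experimenting with Python 3. I've found a few topics that deal with IndexError, but none that seem to help with this specific situation.

I have written a function that opens a text file, reads it one line at a time, and slices each line into separate strings, each of which is appended to a particular list (one list per "column" in the record line). Most of the slices are multiple characters [x:y], but some are single characters [x].

I'm getting an IndexError: string index out of range message when, as far as I can tell, the index is not out of range. Here is the function: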

def read_recipe_file():
    recipe_id = []
    recipe_book = []
    recipe_name = []
    recipe_page = []
    ingred_1 = []
    ingred_1_qty = []
    ingred_2 = []
    ingred_2_qty = []
    ingred_3 = []
    ingred_3_qty = []

    f = open('recipe-file.txt', 'r')  # open the file 
    for line in f:
        # slice out each component of the record line and store it in the appropriate list
        recipe_id.append(line[0:3])
        recipe_name.append(line[3:23])
        recipe_book.append(line[23:43])
        recipe_page.append(line[43:46])
        ingred_1.append(line[46]) 
        ingred_1_qty.append(line[47:50])
        ingred_2.append(line[50]) 
        ingred_2_qty.append(line[51:54])
        ingred_3.append(line[54]) 
        ingred_3_qty.append(line[55:])
    f.close()
    return recipe_id, recipe_name, recipe_book, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, ingred_3, \
           ingred_3_qty

Here is the traceback:

Traceback (most recent call last):
  File "recipe-test.py", line 84, in <module>
    recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, ingred_3, ingred_3_qty = read_recipe_file()
  File "recipe-test.py", line 27, in read_recipe_file
    ingred_1.append(line[46])

The code that calls the function is:

print('To show list of recipes: 1')
print('To add a recipe: 2')
user_choice = input()
recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, \
ingred_3, ingred_3_qty = read_recipe_file()

if int(user_choice) == 1:
    print_recipe_table(recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty,
                    ingred_2, ingred_2_qty, ingred_3, ingred_3_qty)

elif int(user_choice) == 2:
    # code to add recipe
    pass

The line that fails is this one:

ingred_1.append(line[46])

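Every line of the text file I'm trying to read has more than 46 characters, so I don't understand why I'm getting an out-of-range error (a sample line is below). If I change the code to: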

ingred_1.append(line[46:])

to read a slice instead of a specific character, that line executes correctly, but the program then fails on this line instead:

ingred_2.append(line[50])

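This leads me to think the problem is somehow related to appending a single character from the string rather than a slice of multiple characters.

Here is a sample line from the text file I'm reading: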

001Cheese on Toast     Meals For Two       012120038005002

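I should probably add that I'm well aware this isn't very good code overall, and there are many ways the program could be improved. As far as I can tell, though, the code should actually work.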

Answer 1

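This happens if some lines in the file are empty, or at least too short. A stray newline at the end of the file is a common cause, since it shows up as an extra blank line. The best way to debug a case like this is to catch the exception and inspect the specific line that fails (it almost certainly won't be the sample line you posted):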

try:
    ingred_1.append(line[46])
except IndexError:
    print(line)
    print(len(line))

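Catching this exception is also usually the right way to handle the error: you have detected a pathological case, and now you can decide what to do about it. For example, you might:

continue, which silently skips processing of that line,
log something and then continue,
bail out by raising a new, more topical exception, e.g. raise ValueError("Line too short").

Printing something relevant, with or without continuing, is almost always a good idea if the failure represents a problem with the input file that warrants fixing. Continuing silently is a good option if the problem is relatively trivial and you know it can't cause follow-on errors in the rest of your processing. You may want to distinguish the "too short" case from the "completely empty" case by detecting the empty case early, for example at the top of your loop: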

if not line.strip():
    # Skip blank lines (a blank line read from a file is still '\n', so test the stripped text)
    continue

and handle the error for the other case appropriately.
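Putting those options together, here is a minimal sketch of a loop that skips blank lines and raises a more descriptive error for short ones. The names read_records and MIN_LENGTH are hypothetical and not part of the original code; the value 55 assumes the fixed-width layout from the question, where line[54] is the highest single-character index used.

```python
MIN_LENGTH = 55  # assumption: line[54] is the last single-character index in the question's layout


def read_records(lines):
    """Yield (line_number, line) for each usable line, skipping blank ones."""
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            # Blank line, e.g. a stray newline at the end of the file
            continue
        if len(line) < MIN_LENGTH:
            raise ValueError(
                f"Line {lineno} too short: {len(line)} characters, need at least {MIN_LENGTH}"
            )
        yield lineno, line
```

Because an open file object is itself an iterable of lines, it could be passed in directly, for example read_records(f) inside a with open('recipe-file.txt') as f: block.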

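Changing the index to a slice makes the error go away because string slicing never raises IndexError for out-of-range bounds. If both indexes in the slice are outside the string (in the same direction), you simply get an empty string, for example: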

>>> 'abc'[4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> 'abc'[4:]
''
>>> 'abc'[4:7]
''

Answer 2

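Your code fails on line[46] because line contains fewer than 47 characters. The slice operation line[46:] still works because an out-of-range string slice returns an empty string.

You can confirm that the line is too short by replacing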

ingred_1.append(line[46])

with

try:
    ingred_1.append(line[46])
except IndexError:
    print('line = "%s", length = %d' % (line, len(line)))
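If more than one line might be affected, the same check can be run over the whole input in a single pass. This is a rough sketch rather than part of the original answer: find_short_lines is a hypothetical helper, and the default of 47 simply reflects that line[46] needs at least 47 characters. Using repr makes otherwise invisible characters, such as a lone newline, show up in the report.

```python
def find_short_lines(lines, min_length=47):
    """Return (line_number, length, repr_of_line) for every line shorter than min_length."""
    report = []
    for lineno, line in enumerate(lines, start=1):
        if len(line) < min_length:
            # repr() exposes whitespace such as '\n' that print() would hide
            report.append((lineno, len(line), repr(line)))
    return report
```

An open file can be passed directly, for example find_short_lines(f) inside a with open('recipe-file.txt') as f: block.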
