Why do I get an IndexError in Python 3 when indexing a string but not when slicing it?

Question

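I'm new to programming and am experimenting with Python 3. I've found a few topics that deal with IndexError, but none seem to help with this specific situation.

I've written a function that opens a text file, reads it one line at a time, and slices each line into individual strings, each of which is appended to a particular list (one list per "column" in the record line). Most of the slices are multiple characters [x:y], but some are single characters [x].

I'm getting an IndexError: string index out of range message when, as far as I can tell, the index isn't out of range. This is the function: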

def read_recipe_file():
    recipe_id = []
    recipe_book = []
    recipe_name = []
    recipe_page = []
    ingred_1 = []
    ingred_1_qty = []
    ingred_2 = []
    ingred_2_qty = []
    ingred_3 = []
    ingred_3_qty = []

    f = open('recipe-file.txt', 'r')  # open the file 
    for line in f:
        # slice out each component of the record line and store it in the appropriate list
        recipe_id.append(line[0:3])
        recipe_name.append(line[3:23])
        recipe_book.append(line[23:43])
        recipe_page.append(line[43:46])
        ingred_1.append(line[46]) 
        ingred_1_qty.append(line[47:50])
        ingred_2.append(line[50]) 
        ingred_2_qty.append(line[51:54])
        ingred_3.append(line[54]) 
        ingred_3_qty.append(line[55:])
    f.close()
    return recipe_id, recipe_name, recipe_book, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, ingred_3, \
           ingred_3_qty

This is the traceback:

Traceback (most recent call last):
  File "recipe-test.py", line 84, in <module>
    recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, ingred_3, ingred_3_qty = read_recipe_file()
  File "recipe-test.py", line 27, in read_recipe_file
    ingred_1.append(line[46])

The code that calls the function in question is:

print('To show list of recipes: 1')
print('To add a recipe: 2')
user_choice = input()
recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, \
ingred_3, ingred_3_qty = read_recipe_file()

if int(user_choice) == 1:
    print_recipe_table(recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty,
                    ingred_2, ingred_2_qty, ingred_3, ingred_3_qty)

elif int(user_choice) == 2:
    #code to add recipe

The line that fails is this:

ingred_1.append(line[46])

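Every line in the text file I'm trying to read has more than 46 characters, so I don't understand why I'm getting an out-of-bounds error (a sample line is below). If I change the code to this: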

ingred_1.append(line[46:])

to read a slice instead of a specific character, that line executes correctly, but the program then fails on this line instead:

ingred_2.append(line[50])

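This makes me think it is somehow related to appending a single character from the string rather than a slice of multiple characters.

Here is a sample line from the text file I am reading: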

001Cheese on Toast     Meals For Two       012120038005002

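I should probably add that I'm well aware this isn't great code overall - there are lots of ways the program could be improved - but as far as I can tell the code should actually work.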

Answer 1

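This would happen if some of the lines in the file are empty or at least short. A stray newline at the end of the file is a common cause, since it shows up as an extra blank line. The best way to debug a case like this is to catch the exception and investigate the particular line that fails (which almost certainly won't be the sample line you've reproduced):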

try:
    ingred_1.append(line[46])
except IndexError:
    print(line)
    print(len(line))

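Catching this exception is also usually the right way to deal with the error: you've detected a pathological case, and now you can consider what to do. You might, for example:

continue, which will silently skip processing that line,
log something and then continue, or
bail out by raising a new, more topical exception, e.g. raise ValueError("Line too short").

Printing something relevant, with or without continuing, is almost always a good idea if this represents a problem with the input file that warrants fixing. Continuing silently is a good option if the problem is relatively trivial and you know it can't cause knock-on errors in the rest of your processing. You may want to differentiate between the "too short" and "completely empty" cases by detecting the "completely empty" case early, such as by doing this at the top of your loop: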

if not line.strip():  # a blank line read from a file is "\n", not ""
    # Skip blank lines
    continue

and handle the error for the other case appropriately.
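
The options above can be combined into one loop. The following is only a minimal sketch, not the asker's actual program: the name parse_records, the MIN_LENGTH constant, and the use of a dict per record are assumptions made for illustration, and the field positions are copied from the question.

```python
# Hypothetical layout constant: index 54 is the last single-character field,
# so a usable record needs at least 55 characters.
MIN_LENGTH = 55


def parse_records(lines):
    """Return a list of field dicts, skipping blank lines and rejecting short ones."""
    records = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")  # drop the newline that file iteration keeps
        if not line.strip():
            continue  # blank line, e.g. a stray newline at the end of the file
        if len(line) < MIN_LENGTH:
            raise ValueError(f"Line {number} too short ({len(line)} chars): {line!r}")
        records.append({
            "recipe_id": line[0:3],
            "recipe_name": line[3:23],
            "recipe_book": line[23:43],
            "recipe_page": line[43:46],
            "ingred_1": line[46],
            "ingred_1_qty": line[47:50],
            "ingred_2": line[50],
            "ingred_2_qty": line[51:54],
            "ingred_3": line[54],
            "ingred_3_qty": line[55:],
        })
    return records


# The sample record from the question, rebuilt from its fixed-width fields.
sample = "001" + "Cheese on Toast".ljust(20) + "Meals For Two".ljust(20) + "012120038005002"
records = parse_records([sample + "\n", "\n"])
print(records[0]["ingred_1"], records[0]["ingred_3_qty"])  # prints: 1 002
```

With a real file, the open file object could be passed in place of the list, since iterating over it yields the same kind of newline-terminated strings.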

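The reason changing it to a slice works is that string slices never fail on out-of-range indices. If both indices in the slice are outside the string (in the same direction), you simply get an empty string - e.g.: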

>>> 'abc'[4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> 'abc'[4:]
''
>>> 'abc'[4:7]
''
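
To tie this back to the file-reading case, here is a small sketch that simulates a file with a stray blank line at the end. The io.StringIO contents are made up for illustration (a 58-character dummy record followed by an empty line); they are not the asker's data.

```python
import io

# Simulated file contents (an assumption for illustration): one 58-character
# record followed by a stray blank line.
fake_file = io.StringIO("x" * 58 + "\n" + "\n")

results = []
for line in fake_file:
    try:
        single = line[46]      # indexing raises if the line is too short
    except IndexError:
        single = None
    sliced = line[46:47]       # slicing quietly returns '' instead
    results.append((len(line), single, sliced))

print(results)  # prints: [(59, 'x', 'x'), (1, None, '')]
```

The second tuple is the blank line: it has length 1 (just the newline), so indexing fails while the equivalent slice returns an empty string.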

Answer 2

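Your code fails at line[46] because line contains fewer than 47 characters. The slice operation line[46:] still works because an out-of-range string slice returns an empty string.

You can verify that the line is too short by replacing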

ingred_1.append(line[46])

with

try:
    ingred_1.append(line[46])
except IndexError:
    print('line = "%s", length = %d' % (line, len(line)))
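
If more than one line might be affected, it can help to list every short line along with its line number before parsing anything. This is a rough sketch; find_short_lines is a hypothetical helper, and the list of lines below stands in for the real file. Using repr makes an otherwise invisible trailing newline show up in the output.

```python
def find_short_lines(lines, min_length):
    """Return (line_number, length, repr_of_line) for every line shorter than min_length."""
    return [
        (number, len(line), repr(line))
        for number, line in enumerate(lines, start=1)
        if len(line) < min_length
    ]


# Stand-in for the lines of recipe-file.txt; with a real file you could pass
# the open file object instead of a list.
lines = ["x" * 58 + "\n", "short\n", "\n"]
for number, length, shown in find_short_lines(lines, 47):
    print(f"line {number}: length {length}: {shown}")
```

A minimum of 47 is used here because line[46] needs at least 47 characters to succeed.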
