Why am I getting an IndexError in Python 3 when indexing a string and not slicing?

I'm new to programming, and experimenting with Python 3. I've found a few topics which deal with IndexError but none that seem to help with this specific circumstance.

I've written a function which opens a text file, reads it one line at a time, and slices the line up into individual strings which are each appended to a particular list (one list per 'column' in the record line). Most of the slices are multiple characters [x:y] but some are single characters [x].

I'm getting an IndexError: string index out of range message, when as far as I can tell, it isn't. This is the function:

def read_recipe_file():
    recipe_id = []
    recipe_book = []
    recipe_name = []
    recipe_page = []
    ingred_1 = []
    ingred_1_qty = []
    ingred_2 = []
    ingred_2_qty = []
    ingred_3 = []
    ingred_3_qty = []

    f = open('recipe-file.txt', 'r')  # open the file 
    for line in f:
        # slice out each component of the record line and store it in the appropriate list
        recipe_id.append(line[0:3])
        recipe_name.append(line[3:23])
        recipe_book.append(line[23:43])
        recipe_page.append(line[43:46])
        ingred_1.append(line[46]) 
        ingred_1_qty.append(line[47:50])
        ingred_2.append(line[50]) 
        ingred_2_qty.append(line[51:54])
        ingred_3.append(line[54]) 
        ingred_3_qty.append(line[55:])
    f.close()
    return recipe_id, recipe_name, recipe_book, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, ingred_3, \
           ingred_3_qty

This is the traceback:

Traceback (most recent call last):
  File "recipe-test.py", line 84, in <module>
    recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, ingred_3, ingred_3_qty = read_recipe_file()
  File "recipe-test.py", line 27, in read_recipe_file
    ingred_1.append(line[46])

The code which calls the function in question is:

print('To show list of recipes: 1')
print('To add a recipe: 2')
user_choice = input()
recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, \
ingred_3, ingred_3_qty = read_recipe_file()

if int(user_choice) == 1:
    print_recipe_table(recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty,
                    ingred_2, ingred_2_qty, ingred_3, ingred_3_qty)

elif int(user_choice) == 2:
    # code to add recipe
    pass

The failing line is this:

ingred_1.append(line[46])

There are more than 46 characters in each line of the text file I am trying to read, so I don't understand why I'm getting an out of bounds error (a sample line is below). If I change the code to this:

ingred_1.append(line[46:])

to read a slice, rather than a specific character, the line executes correctly, and the program fails on this line instead:

ingred_2.append(line[50])

This leads me to think it is somehow related to appending a single character from the string, rather than a slice of multiple characters.

Here is a sample line from the text file I am reading:

001Cheese on Toast     Meals For Two       012120038005002

I should probably add that I'm well aware this isn't great code overall - there are lots of ways I could generally improve the program, but as far as I can tell the code should actually work.

This will happen if some of the lines in the file are empty, or at least shorter than 47 characters. A stray newline at the end of the file is a common cause, since that comes up as an extra blank line. The best way to debug a case like this is to catch the exception and investigate the particular line that fails (which almost certainly won't be the sample line you reproduced):

try:
    ingred_1.append(line[46])
except IndexError:
    print(line)
    print(len(line))

Catching this exception is also usually the right way to deal with the error: you've detected a pathological case, and now you can consider what to do. You might for example:

  • continue, which will silently skip processing that line,
  • log something and then continue, or
  • bail out by raising a new, more topical exception, e.g. raise ValueError("Line too short").

Printing something relevant, with or without continuing, is almost always a good idea if this represents a problem with the input file that warrants fixing. Continuing silently is a good option if the problem is relatively trivial and you know it can't cause flow-on errors in the rest of your processing. You may want to differentiate between the "too short" and "completely empty" cases by detecting the "completely empty" case early, for example at the top of your loop:

if not line.strip():
    # Skip blank lines (a blank line read from a file is '\n', which is still truthy)
    continue

Then handle the error for the other case appropriately.
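Those options can be combined in one place. The following is only a sketch: read_records, split_record and the min_length value of 55 are names and assumptions made up for illustration, not part of the original code. It skips blank lines, and raises a ValueError naming the offending line whenever a non-blank line is too short for line[54] to exist.

```python
def split_record(line):
    """Slice one fixed-width record line into its ten fields."""
    return (
        line[0:3], line[3:23], line[23:43], line[43:46],
        line[46], line[47:50], line[50], line[51:54],
        line[54], line[55:],
    )


def read_records(lines, min_length=55):
    """Parse an iterable of lines, skipping blank ones.

    min_length=55 is an assumption: it is the shortest length
    for which the last single-character index, line[54], exists.
    """
    records = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")  # drop the trailing newline
        if not line.strip():
            continue  # blank line: skip silently
        if len(line) < min_length:
            raise ValueError(
                f"Line {number} too short ({len(line)} characters): {line!r}"
            )
        records.append(split_record(line))
    return records
```

Because iterating over an open file yields its lines, an open file object can be passed directly as lines. Returning one tuple per record is a design choice for the sketch; the ten parallel lists from the question could be filled in the same loop instead.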


Changing it to a slice works because string slices never raise an IndexError. Slice indexes that fall beyond the end of the string are clamped to its length, so if both indexes in the slice are outside the string (in the same direction), you will get an empty string, e.g.:

>>> 'abc'[4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> 'abc'[4:]
''
>>> 'abc'[4:7]
''
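The same behaviour means a one-character slice can stand in for an index when a missing character is acceptable. The helper below is a hypothetical illustration (char_at is not a built-in), and it assumes index is non-negative. Note that it hides short lines rather than reporting them, so it only suits cases where an empty field is a valid result.

```python
def char_at(text, index):
    """Return the character at a non-negative index, or '' if text is too short."""
    # A one-character slice never raises; it is simply empty when out of range.
    return text[index:index + 1]
```

For example, char_at('abc', 1) gives 'b', while char_at('abc', 4) gives '' where 'abc'[4] would raise IndexError.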

Your code fails on line[46] because line contains fewer than 47 characters. The slice operation line[46:] still works because an out-of-range string slice returns an empty string.

You can verify that the line is too short by replacing

ingred_1.append(line[46])

with

try:
    ingred_1.append(line[46])
except IndexError:
    print('line = "%s", length = %d' % (line, len(line)))
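To find every offending line in one pass, rather than stopping at the first, a small scan along these lines may help. It is a sketch only: find_short_lines and its required_index parameter are names invented here, with 46 taken from the failing line[46] in the question.

```python
def find_short_lines(lines, required_index=46):
    """Return (line_number, length, text) for each line too short to index."""
    short = []
    for number, line in enumerate(lines, start=1):
        # line[required_index] exists only if len(line) > required_index
        if len(line) <= required_index:
            short.append((number, len(line), line))
    return short
```

Calling it on the open file object and printing repr() of each reported text makes invisible characters such as '\n' show up, which is usually how a stray blank line reveals itself.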
