
Python For Loop with Enumerate: 'String Index Out Of Range'

I have been trying to write a for loop in Python, where the goal is to iterate through a list of alphabetically-sorted letters and a corresponding list of numbers, and perform a cumulative calculation:

  1. If a letter in the first list is different from the previous letter (i.e. it is the first time that letter appears in the list), divide the corresponding number from the second list by 0.5 and append the result to a new list.
  2. Until the letter in the first list changes, take the previous value from the new list, multiply it by 0.5, add the current item from the number list, and append the result to the new list.

This example shows how this would work in Excel: [image: MS Excel calculation]

At first, I thought I could do something like what was outlined here, using a while loop that references previous values. That did not seem workable for this purpose, though, because I also need to reference previous values in an existing list (the letters), not just previous values in the new list.

I then tried to use a for loop with enumerate, but was getting a 'string index out of range' error:

import numpy as np
import pandas as pd

d={'Letters':['A','A','A','B','B','B'],'Numbers':[1,2,3,4,5,6]}

df=pd.DataFrame(data=d)

Number=df.Numbers
Letter=df.Letters

NumberOutput=[]
for index,(x,y) in enumerate(zip(Number,Letter)):
    if index==0:
        NumberOutput.append(x/0.5)
    elif index>0 and y!=y[index-1]:
        NumberOutput.append(x/0.5)
    else:
        NumberOutput.append((NumberOutput[index-1]*0.5)+x)

I am assuming that the problem is that the for loop tries to reference the previous Letter value at position 0 in the list, i.e. the previous string index doesn't exist. However, the loop explicitly handles the case of index==0 before it references index-1, so I'm not clear on why this causes the error.
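A likely cause, sketched below under the assumption that the loop is the one shown above: inside the loop, y is a single letter (a one-character string) rather than the whole Letter column, so y[index-1] indexes into that string. With the sample data the first failure would come at index 2, where y[1] does not exist. The names `letters`, `error_message` and `previous_letter` are illustrative only and do not appear in the original code.

```python
letters = ['A', 'A', 'A', 'B', 'B', 'B']

# Inside the loop, y holds one element of the letters, not the whole list.
y = letters[2]
print(type(y), len(y))   # a str of length 1

# Position 0 is the only valid index of a one-character string.
print(y[0])

# y[index-1] with index == 2 asks for y[1], which does not exist.
try:
    y[1]
except IndexError as exc:
    error_message = str(exc)
    print(error_message)

# Indexing the list itself gives the previous letter, which is what was intended.
previous_letter = letters[2 - 1]
print(previous_letter)
```

If this reading is right, comparing y against Letter[index-1] instead of y[index-1] should make the original loop run.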

I ended up going a different route and using two separate for loops. The first creates a list of 'LetterFlags' (1 whenever the letter in the first list changes, 0 otherwise), and the second uses the 'LetterFlag' to determine whether the current Letter is different from the previous one. This approach doesn't throw an error and produces correct results, but it doesn't seem like the most efficient way to do this:

import numpy as np
import pandas as pd

d={'Letters':['A','A','A','B','B','B'],'Numbers':[1,2,3,4,5,6]}

df=pd.DataFrame(data=d)

LetterFlag=[]
Number=df.Numbers
Letter=df.Letters

for (previousL,currentL) in zip(Letter,Letter[1:]):
    if previousL!=currentL:
        LetterFlag.append(1)
    else:
        LetterFlag.append(0)
    
LetterFlag=[1]+LetterFlag
df['LetterFlag']=LetterFlag
LetterFlag=df.LetterFlag

NumberOutput=[]
for index,(x,y) in enumerate(zip(LetterFlag,Number)):
    if x==1:
        NumberOutput.append(y/0.5)
    else:
        NumberOutput.append((NumberOutput[index-1]*0.5)+y)

Is there a better way that I could have done this? Thank you in advance for any guidance.

If I understand the question correctly, you could do this:

letters = ['A', 'A', 'A', 'B', 'B', 'B']
numbers = [1, 2, 3, 4, 5, 6]
assert len(letters) == len(numbers)

numout = [numbers[0]*0.5]

for i, L in enumerate(letters[1:], 1):
    if L != letters[i-1]:
        numout.append(numbers[i]*0.5)
        
print(*numout)

With these values the output will be: 0.5 2.0
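Note that the snippet above emits a value only when the letter changes, and it multiplies by 0.5 where the question divides. If the goal is one output per row following both rules in the question, a single loop that remembers the previous letter may be enough. This is a minimal sketch, not a definitive implementation: the function name `cumulative_by_letter` and the `factor` parameter are made up for illustration. Tracking the previous letter in a variable, rather than indexing into `y`, also avoids the 'string index out of range' error from the question.

```python
def cumulative_by_letter(letters, numbers, factor=0.5):
    """Apply the two rules from the question in a single pass."""
    if len(letters) != len(numbers):
        raise ValueError("letters and numbers must have the same length")

    output = []
    previous_letter = None
    for letter, number in zip(letters, numbers):
        if not output or letter != previous_letter:
            # Rule 1: first row of a new letter group.
            output.append(number / factor)
        else:
            # Rule 2: same letter as the previous row.
            output.append(output[-1] * factor + number)
        previous_letter = letter
    return output


letters = ['A', 'A', 'A', 'B', 'B', 'B']
numbers = [1, 2, 3, 4, 5, 6]

result = cumulative_by_letter(letters, numbers)
print(result)  # [2.0, 3.0, 4.5, 8.0, 9.0, 10.5]
```

With a DataFrame like the one in the question, the same function could be called as cumulative_by_letter(df.Letters, df.Numbers), since zip iterates over the column values.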
