Pandas Dataframe iterating over rows

Question

Here is the dataset

import pandas as pd
d = {'Key':['A','A','A','A'],'Rank':[1,2,3,4],'col1': [15000,12000,6000,7000], 'col2': [15000,10000,0,0],'col4': [10000,10000,10000,10000],'col5': [0,0,0,0] }
df = pd.DataFrame(data=d)
df

col1 = maximum value the row can take
col2 = current value the row holds
col4 = remaining value that should be fitted into these records

I am trying to fill 'col5' with the maximum value each row can still take, where 'col1' defines the row's upper limit and 'col2' shows its current value. If a row is already at its maximum, move on to the next row. The amount available to distribute is given by 'col4'. Please see the example below.

Example:

First record (Rank 1): col1=15000 and col2=15000, so it is already full; move to the next row.
Second record (Rank 2): col1=12000 and col2=10000. Its maximum is 12000, so I can add 2000 more. I also need to make sure the remaining col4 (10000) covers that 2000, so col5=2000, and col4 for the next record will be 10000-2000=8000.

Here is what the end dataset should look like

Below is the code I have tried:

for index, row in df.iterrows():
    #print(row['col1'], row['col2'])
    if row['col1']>row['col2']:
        
        if (row['col1']-row['col2'])<row['col2']:
            row['col5']=row['col1']-row['col2']
        else:
            row['col5']=row['col2']
    #return
    print(row['col1'], row['col2'],row['col5'])

Answer 1

This should do what you need (updated to handle multiple keys):

import pandas as pd

d = {'Key': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'], 'Rank': [1, 2, 3, 4, 1, 2, 3, 4],
 'col1': [15000, 12000, 6000, 7000, 15000, 12000, 6000, 7000], 'col2': [15000, 10000, 0, 0, 15000, 10000, 0, 0],
 'col4': [10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000], 'col5': [0, 0, 0, 0, 0, 0, 0, 0]}
df = pd.DataFrame(data=d)
print(df)

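results = []  # one filled-in frame per key

for key, df_tmp in df.groupby('Key'):
    df_tmp = df_tmp.copy()  # edit a copy of the group, not a view of df
    tmp_value = 0  # remaining 'col4' budget for this key
    for index, row in df_tmp.iterrows():
        if tmp_value == 0:
            tmp_value = row['col4']  # start from the key's 'col4' value
        if row['col1'] > row['col2']:
            diff_value = row['col1'] - row['col2']  # room left in this row
            if diff_value < tmp_value:
                df_tmp.at[index, 'col5'] = diff_value
                tmp_value = tmp_value - diff_value
            else:
                # this row uses up the rest of the budget, so stop here
                df_tmp.at[index, 'col5'] = tmp_value
                break
    results.append(df_tmp)

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df_result = pd.concat(results)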

print(df_result)

A few hints:
tmp_value holds the value from col4 and decreases as rows are filled.
In my opinion you should leave the loop with break, not with exit.
Here you can read about editing pandas rows while iterating over them: Update a dataframe in pandas while iterating row by row.
Edit: You could also fetch the 'col4' data per key first, save it in an array, and change the original dataframe directly, but that's up to you.
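
The last hint (changing the original dataframe directly) could look roughly like the sketch below. It is only one possible reading of that idea: the names remaining, room and fill are made up for illustration, and it assumes 'col4' is the same for every row of a key, so the first value per key can serve as the budget. Writing through df.at matters here, because assigning to the row returned by iterrows() does not reliably write back to the dataframe.

```python
import pandas as pd

df = pd.DataFrame({
    'Key':  ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Rank': [1, 2, 3, 4, 1, 2, 3, 4],
    'col1': [15000, 12000, 6000, 7000, 15000, 12000, 6000, 7000],
    'col2': [15000, 10000, 0, 0, 15000, 10000, 0, 0],
    'col4': [10000] * 8,
    'col5': [0] * 8,
})

# Assumption: 'col4' is constant within a key, so its first value is the budget.
remaining = df.groupby('Key')['col4'].first().to_dict()

# Iterate over a sorted copy (rank order) and write into the original frame.
for index, row in df.sort_values(['Key', 'Rank']).iterrows():
    key = row['Key']
    room = row['col1'] - row['col2']          # how much this row can still take
    fill = max(0, min(room, remaining[key]))  # never exceed the remaining budget
    df.at[index, 'col5'] = fill               # df.at writes to df itself
    remaining[key] -= fill

print(df)
```

With the sample data this gives the same 'col5' values as the loop above; the difference is that no per-key result frames have to be collected and concatenated afterwards.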

solution1
1 ACCEPTED 2020-09-25 15:47:11
