Here is the dataset:
import pandas as pd
d = {'Key':['A','A','A','A'],'Rank':[1,2,3,4],'col1': [15000,12000,6000,7000], 'col2': [15000,10000,0,0],'col4': [10000,10000,10000,10000],'col5': [0,0,0,0] }
df = pd.DataFrame(data=d)
df
I am trying to fill 'col5' with the maximum value it can take, where 'col1' defines the upper limit and 'col2' holds the current value. Once a row has been filled to its maximum, move on to the next row. The amount available to fit is determined by 'col4'. Please see the example below.
Example:
Here is what the end dataset should look like:
Below is the code I have tried:
for index, row in df.iterrows():
    #print(row['col1'], row['col2'])
    if row['col1']>row['col2']:
        if (row['col1']-row['col2'])<row['col2']:
            row['col5']=row['col1']-row['col2']
        else:
            row['col5']=row['col2']
            #return
    print(row['col1'], row['col2'],row['col5'])
This should do what you need (updated to handle multiple keys):
import pandas as pd
d = {'Key': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'], 'Rank': [1, 2, 3, 4, 1, 2, 3, 4],
'col1': [15000, 12000, 6000, 7000, 15000, 12000, 6000, 7000], 'col2': [15000, 10000, 0, 0, 15000, 10000, 0, 0],
'col4': [10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000], 'col5': [0, 0, 0, 0, 0, 0, 0, 0]}
df = pd.DataFrame(data=d)
print(df)
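parts = []  # one processed sub-frame per key
for key, group in df.groupby('Key'):
    df_tmp = group.copy()  # work on a copy of the group
    tmp_value = 0
    for index, row in df_tmp.iterrows():
        if tmp_value == 0:
            # first row of the group: start from the amount given in col4
            tmp_value = row['col4']
        # print(row['col1'], row['col2'])
        if row['col1'] > row['col2']:
            diff_value = row['col1'] - row['col2']
            if diff_value < tmp_value:
                # the whole gap fits; reduce what is left
                df_tmp.at[index, 'col5'] = diff_value
                tmp_value = tmp_value - diff_value
            else:
                # only the remainder fits; nothing is left for later rows
                df_tmp.at[index, 'col5'] = tmp_value
                break
    parts.append(df_tmp)
# DataFrame.append was removed in pandas 2.0, so combine the parts with pd.concat
df_result = pd.concat(parts)
print(df_result)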
A few hints:
tmp_value holds the value from 'col4' and is decreased as rows are filled.
In my opinion, you should leave the loop with break, not with exit.
Here you can read about editing pandas rows while iterating over them: Update a dataframe in pandas while iterating row by row.
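To illustrate the last hint, here is a minimal sketch using a cut-down version of the question's data (the two-row frame and the variable names are made up for the demonstration). For a frame with mixed column types like this one, iterrows() hands back a copy of each row, so assigning to row leaves the DataFrame unchanged, while df.at writes into the DataFrame itself.

```python
import pandas as pd

# Small mixed-dtype frame modelled on the question's data (illustrative only).
df = pd.DataFrame({
    'Key': ['A', 'A'],
    'col1': [15000, 12000],
    'col2': [15000, 10000],
    'col5': [0, 0],
})

# Assigning to the row yielded by iterrows() only changes a copy.
for index, row in df.iterrows():
    row['col5'] = row['col1'] - row['col2']
after_row_assignment = df['col5'].tolist()

# Assigning through df.at writes into the DataFrame itself.
for index, row in df.iterrows():
    df.at[index, 'col5'] = row['col1'] - row['col2']
after_at_assignment = df['col5'].tolist()

print(after_row_assignment)
print(after_at_assignment)
```

This is most likely why the loop in the question appears to do nothing: it only ever assigns to row.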
Edit: You can also collect the keys first, save the 'col4' data in an array, and change the original dataframe directly, but that's up to you.
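A rough sketch of that alternative, under a few assumptions that are mine rather than the question's: the amount to distribute for each key is the first 'col4' value of that key, rows are processed in 'Rank' order, and the name remaining is just a hypothetical helper dictionary.

```python
import pandas as pd

df = pd.DataFrame({
    'Key':  ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Rank': [1, 2, 3, 4, 1, 2, 3, 4],
    'col1': [15000, 12000, 6000, 7000, 15000, 12000, 6000, 7000],
    'col2': [15000, 10000, 0, 0, 15000, 10000, 0, 0],
    'col4': [10000] * 8,
    'col5': [0] * 8,
})

# Assumption: each key starts with the first col4 value of that key.
remaining = df.groupby('Key')['col4'].first().to_dict()

# Walk the rows in Key/Rank order and write straight into the original frame.
for index in df.sort_values(['Key', 'Rank']).index:
    key = df.at[index, 'Key']
    room = max(df.at[index, 'col1'] - df.at[index, 'col2'], 0)  # gap up to the limit
    fill = min(room, remaining[key])                             # cannot exceed what is left
    df.at[index, 'col5'] = fill
    remaining[key] -= fill

print(df)
```

For the sample data this should produce the same 'col5' values as the grouped loop above, without building a separate result frame.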