
Using Python, update the maximum value in each DataFrame row with the sum of [column with maximum value] and [Threshold column]

Day US  INDIA   JAPAN   GERMANY AUSTRALIA Threshold
11  40  30      20      100     110         5
21  60  70      80      55      57          8
32  12  43      57      87      98          9
41  99  23      45      65      78          12

This is the demo DataFrame. For each row, I want to pick the maximum among three countries (INDIA, GERMANY, US), add that row's Threshold value to it, and write the result back into the DataFrame in place of the old maximum. For example:

max[US,INDIA,GERMANY] = max[US,INDIA,GERMANY] + threshold

After this operation, the DataFrame should be updated as below:

Day US  INDIA   JAPAN   GERMANY AUSTRALIA Threshold
11  40  30      20      105     110       5
21  60  78      80      55      57        8
32  12  43      57      96      98        9
41  111 23      45      65      78        12

I tried to achieve this with a for loop, but it takes too long to execute:

df_max = df_final[['US','INDIA','GERMANY']].idxmax(axis=1)
for ind in df_final.index:
    column = df_max[ind]
    df_final[column][ind] = df_final[column][ind] + df_final['Threshold'][ind]

Please help me with this. Looking forward to a good solution. Thanks in advance!

The first solution compares the row-wise maximum with every value in the selected columns, multiplies the resulting boolean mask by Threshold, and adds the result to the original columns:

cols = ['US','INDIA','GERMANY']
df_final[cols] += (df_final[cols].eq(df_final[cols].max(axis=1), axis=0)
                        .mul(df_final['Threshold'], axis=0))

print (df_final)
   Day   US  INDIA  JAPAN  GERMANY  AUSTRALIA  Threshold
0   11   40     30     20      105        110          5
1   21   60     78     80       55         57          8
2   32   12     43     57       96         98          9
3   41  111     23     45       65         78         12
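
To make the mask-and-multiply idea easier to follow, here is a minimal sketch that rebuilds the question's sample data and splits the one-liner into named steps. The intermediate names is_row_max and increment are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df_final = pd.DataFrame({
    "Day": [11, 21, 32, 41],
    "US": [40, 60, 12, 99],
    "INDIA": [30, 70, 43, 23],
    "JAPAN": [20, 80, 57, 45],
    "GERMANY": [100, 55, 87, 65],
    "AUSTRALIA": [110, 57, 98, 78],
    "Threshold": [5, 8, 9, 12],
})

cols = ["US", "INDIA", "GERMANY"]

# Boolean mask: True where a cell equals its row maximum over `cols`.
is_row_max = df_final[cols].eq(df_final[cols].max(axis=1), axis=0)

# Multiplying the mask by Threshold (aligned on the row index) keeps
# Threshold at the maximum and gives 0 everywhere else.
increment = is_row_max.mul(df_final["Threshold"], axis=0)

# Add the increment back onto the selected columns.
df_final[cols] = df_final[cols] + increment
print(df_final)
```

Printing is_row_max and increment on their own is a quick way to confirm which cells will change before the DataFrame is modified.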

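Alternatively, use NumPy: get the column name of each row's maximum with idxmax, compare it against an array built from the list cols, multiply the resulting mask by Threshold, and add the result to the original columns: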

import numpy as np

cols = ['US','INDIA','GERMANY']

df_final[cols] += ((np.array(cols) == df_final[cols].idxmax(axis=1).to_numpy()[:, None]) * 
                     df_final['Threshold'].to_numpy()[:, None])

print (df_final)
   Day   US  INDIA  JAPAN  GERMANY  AUSTRALIA  Threshold
0   11   40     30     20      105        110          5
1   21   60     78     80       55         57          8
2   32   12     43     57       96         98          9
3   41  111     23     45       65         78         12
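
The NumPy version relies on broadcasting, which can be hard to read in a single expression. The sketch below rebuilds the sample data and names the intermediate arrays; winners, mask and increment are illustrative names, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df_final = pd.DataFrame({
    "Day": [11, 21, 32, 41],
    "US": [40, 60, 12, 99],
    "INDIA": [30, 70, 43, 23],
    "JAPAN": [20, 80, 57, 45],
    "GERMANY": [100, 55, 87, 65],
    "AUSTRALIA": [110, 57, 98, 78],
    "Threshold": [5, 8, 9, 12],
})

cols = ["US", "INDIA", "GERMANY"]

# Column name of the (first) maximum in each row, as a column vector of shape (4, 1).
winners = df_final[cols].idxmax(axis=1).to_numpy()[:, None]

# Broadcasting (3,) against (4, 1) yields a (4, 3) boolean mask
# with exactly one True per row.
mask = np.array(cols) == winners

# Threshold as a (4, 1) column vector, broadcast across the three columns.
increment = mask * df_final["Threshold"].to_numpy()[:, None]

df_final[cols] = df_final[cols].to_numpy() + increment
print(df_final)
```

Because idxmax returns a single label per row, the mask can never mark more than one cell in a row.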

The two solutions behave differently when a row contains several equal maximum values.

The first solution adds the threshold to every maximum in the row, while the second adds it only to the first maximum.

print (df_final)
   Day  US  INDIA  JAPAN  GERMANY  AUSTRALIA  Threshold
0   11  40    100     20      100        110          5 <- changed data: two values of 100
1   21  60     70     80       55         57          8
2   32  12     43     57       87         98          9
3   41  99     23     45       65         78         12


cols = ['US','INDIA','GERMANY']
df_final[cols] += (df_final[cols].eq(df_final[cols].max(axis=1), axis=0)
                        .mul(df_final['Threshold'], axis=0))

print (df_final)
   Day   US  INDIA  JAPAN  GERMANY  AUSTRALIA  Threshold
0   11   40    105     20      105        110          5
1   21   60     78     80       55         57          8
2   32   12     43     57       96         98          9
3   41  111     23     45       65         78         12

cols = ['US','INDIA','GERMANY']

df_final[cols] += ((np.array(cols) == df_final[cols].idxmax(axis=1).to_numpy()[:, None]) * 
                     df_final['Threshold'].to_numpy()[:, None])

print (df_final)
   Day   US  INDIA  JAPAN  GERMANY  AUSTRALIA  Threshold
0   11   40    105     20      100        110          5
1   21   60     78     80       55         57          8
2   32   12     43     57       96         98          9
3   41  111     23     45       65         78         12
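
Since the choice between the two solutions only matters when ties exist, it may be worth checking for them first. This is a small sketch using the tied sample data above; the names is_row_max and tie_rows are illustrative.

```python
import pandas as pd

# Tied sample data from above: INDIA and GERMANY are both 100 in the first row.
df_final = pd.DataFrame({
    "Day": [11, 21, 32, 41],
    "US": [40, 60, 12, 99],
    "INDIA": [100, 70, 43, 23],
    "JAPAN": [20, 80, 57, 45],
    "GERMANY": [100, 55, 87, 65],
    "AUSTRALIA": [110, 57, 98, 78],
    "Threshold": [5, 8, 9, 12],
})

cols = ["US", "INDIA", "GERMANY"]

# True where a cell equals its row maximum over `cols`.
is_row_max = df_final[cols].eq(df_final[cols].max(axis=1), axis=0)

# Rows where more than one column holds the maximum.
tie_rows = df_final[is_row_max.sum(axis=1) > 1]
print(tie_rows)
```

If tie_rows is empty, both solutions should give the same result on that data.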


 