
Selecting values from pandas dataframe based off of columns with min/max values in another dataframe

I have two dataframes with different quarters of the year as columns and particular locations as rows:

Temperature:

   q_1  q_2  q_3  q_4
A   10   50    0    5
B    6    0   30    1
C   60    2    9   16

Precipitation:

   q_1  q_2  q_3  q_4
A   18    1    0    7
B    6   13   12    3
C    3   20    4    0

I am trying to create another dataframe whose columns are populated by the temperatures for the wettest/driest quarter and the precipitation of the warmest/coolest quarters for each location:

DF_new:

   temp_wettest  temp_driest  precip_warmest  precip_coolest
A            10            0               1               0
B             0            1              12              13
C             2           16               3              20

I have been trying to use idxmax:

import pandas as pd

temp = pd.DataFrame({'q_1' : [10,6,60],
                     'q_2' : [50,0,2],
                     'q_3' : [0,30,9],
                     'q_4' : [5,1,16]},index=['A','B','C'])
prec = pd.DataFrame({'q_1' : [18,6,3],
                     'q_2' : [1,13,20],
                     'q_3' : [0,12,4],
                     'q_4' : [7,3,0]},index=['A','B','C'])

DF_new = pd.DataFrame({'temp_wettest': temp[prec.idxmax(axis=1)],
                       'temp_driest' : temp[prec.idxmin(axis=1)],
                       'precip_warmest': prec[temp.idxmax(axis=1)],
                       'precip_coolest': prec[temp.idxmin(axis=1)]},index=['A','B','C'])

<OUT>
    temp_wettest    temp_driest precip_warmest  precip_coolest
A      (q, _, 1)      (q, _, 3)      (q, _, 2)       (q, _, 3)
B      (q, _, 2)      (q, _, 4)      (q, _, 3)       (q, _, 2)
C      (q, _, 2)      (q, _, 4)      (q, _, 1)       (q, _, 2)

I get why idxmax isn't working (it's just passing in a list of column names), but I'm not sure how to get the actual values into the new dataframe. I've also tried using pd.apply(), but I'm unsure of what function to use.
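To make the failure concrete, here is a minimal sketch using the same data as above. The variable names wettest and selected are only illustrative. idxmax(axis=1) returns one column label per row, and indexing a DataFrame with those labels selects whole columns rather than one cell per row.

```python
import pandas as pd

temp = pd.DataFrame({'q_1': [10, 6, 60],
                     'q_2': [50, 0, 2],
                     'q_3': [0, 30, 9],
                     'q_4': [5, 1, 16]}, index=['A', 'B', 'C'])
prec = pd.DataFrame({'q_1': [18, 6, 3],
                     'q_2': [1, 13, 20],
                     'q_3': [0, 12, 4],
                     'q_4': [7, 3, 0]}, index=['A', 'B', 'C'])

# One column label per location: A -> q_1, B -> q_2, C -> q_2
wettest = prec.idxmax(axis=1)

# Indexing with a Series of labels is plain column selection, so the
# result is a 3x3 DataFrame with columns q_1, q_2, q_2, not a Series
# holding one temperature per location.
selected = temp[wettest]
print(selected.shape)  # (3, 3)
```

What is needed instead is a pairwise lookup: for each row label, the value in that row's own chosen column.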

Thanks

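If you have a pandas version older than 2.0, try lookup (DataFrame.lookup was deprecated in 1.2.0 and removed in 2.0.0, so it still runs on 1.2 through 1.5 but emits a deprecation warning):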

DF_new = pd.DataFrame({'temp_wettest': temp.lookup(prec.index, prec.idxmax(axis=1)),
                       'temp_driest' : temp.lookup(prec.index, prec.idxmin(axis=1)),
                       'precip_warmest': prec.lookup(temp.index, temp.idxmax(axis=1)),
                       'precip_coolest': prec.lookup(temp.index, temp.idxmin(axis=1))
                      }, index=temp.index)  # lookup returns plain arrays, so restore the A/B/C labels

Output:

   temp_wettest  temp_driest  precip_warmest  precip_coolest
A            10            0               1               0
B             0            1              12              13
C             2           16               3              20
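On pandas 2.0 and later, where lookup is no longer available, one possible replacement is to translate the column labels into integer positions and index the underlying NumPy array. The sketch below assumes both frames share the same index and columns. The helper name lookup_values is hypothetical, not a pandas function.

```python
import numpy as np
import pandas as pd

temp = pd.DataFrame({'q_1': [10, 6, 60],
                     'q_2': [50, 0, 2],
                     'q_3': [0, 30, 9],
                     'q_4': [5, 1, 16]}, index=['A', 'B', 'C'])
prec = pd.DataFrame({'q_1': [18, 6, 3],
                     'q_2': [1, 13, 20],
                     'q_3': [0, 12, 4],
                     'q_4': [7, 3, 0]}, index=['A', 'B', 'C'])


def lookup_values(values, col_labels):
    """Hypothetical helper: for each row of `values`, return the entry
    in the column named by `col_labels` for that row."""
    # Align the labels to the row order of `values`.
    col_labels = col_labels.reindex(values.index)
    # Convert column labels to integer positions (-1 means "not found").
    col_pos = values.columns.get_indexer(col_labels)
    if (col_pos < 0).any():
        raise KeyError("some column labels are missing from `values`")
    row_pos = np.arange(len(values))
    # Pairwise (row, column) selection on the raw array.
    return values.to_numpy()[row_pos, col_pos]


DF_new = pd.DataFrame({
    'temp_wettest': lookup_values(temp, prec.idxmax(axis=1)),
    'temp_driest': lookup_values(temp, prec.idxmin(axis=1)),
    'precip_warmest': lookup_values(prec, temp.idxmax(axis=1)),
    'precip_coolest': lookup_values(prec, temp.idxmin(axis=1)),
}, index=temp.index)
print(DF_new)
```

Passing index=temp.index keeps the A/B/C labels, because the helper returns a plain array with no index of its own.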

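I don't know the best alternative to lookup, but this might work: stack each dataframe into a Series keyed by (row, column) pairs and select the pairs given by idxmax/idxmin.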

DF_new = pd.DataFrame({'temp_wettest': temp.stack().loc[list(map(tuple,prec.idxmax(axis=1).reset_index().to_numpy()))].tolist(),
                       'temp_driest' : temp.stack().loc[list(map(tuple,prec.idxmin(axis=1).reset_index().to_numpy()))].tolist(),
                       'precip_warmest': prec.stack().loc[list(map(tuple,temp.idxmax(axis=1).reset_index().to_numpy()))].tolist(),
                       'precip_coolest': prec.stack().loc[list(map(tuple,temp.idxmin(axis=1).reset_index().to_numpy()))].tolist()})
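The same stack-based idea can be written with less repetition by moving the selection into a small function. This is only a sketch under the assumption that both frames share the same index and contain no missing values. The helper name pick is hypothetical.

```python
import pandas as pd

temp = pd.DataFrame({'q_1': [10, 6, 60],
                     'q_2': [50, 0, 2],
                     'q_3': [0, 30, 9],
                     'q_4': [5, 1, 16]}, index=['A', 'B', 'C'])
prec = pd.DataFrame({'q_1': [18, 6, 3],
                     'q_2': [1, 13, 20],
                     'q_3': [0, 12, 4],
                     'q_4': [7, 3, 0]}, index=['A', 'B', 'C'])


def pick(values, labels):
    """Hypothetical helper: select values[row, labels[row]] for every row."""
    # stack() yields a Series indexed by (row label, column label) pairs.
    stacked = values.stack()
    # Build the pairs to select, e.g. [('A', 'q_1'), ('B', 'q_2'), ...].
    keys = list(zip(labels.index, labels))
    # Drop the MultiIndex and re-label the result by row only.
    return pd.Series(stacked.loc[keys].to_numpy(), index=labels.index)


DF_new = pd.DataFrame({
    'temp_wettest': pick(temp, prec.idxmax(axis=1)),
    'temp_driest': pick(temp, prec.idxmin(axis=1)),
    'precip_warmest': pick(prec, temp.idxmax(axis=1)),
    'precip_coolest': pick(prec, temp.idxmin(axis=1)),
})
print(DF_new)
```

Because pick returns a Series indexed by location, the resulting dataframe keeps the A/B/C row labels without an explicit index argument.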
