Selecting values from pandas dataframe based off of columns with min/max values in another dataframe
I have two dataframes with different quarters of the year as columns and particular locations as rows:
Temperature:

| | q_1 | q_2 | q_3 | q_4 |
|---|---|---|---|---|
| A | 10 | 50 | 0 | 5 |
| B | 6 | 0 | 30 | 1 |
| C | 60 | 2 | 9 | 16 |

Precipitation:

| | q_1 | q_2 | q_3 | q_4 |
|---|---|---|---|---|
| A | 18 | 1 | 0 | 7 |
| B | 6 | 13 | 12 | 3 |
| C | 3 | 20 | 4 | 0 |

I am trying to create another dataframe whose columns hold, for each location, the temperature of the wettest and driest quarters and the precipitation of the warmest and coolest quarters:
DF_new:

| | temp_wettest | temp_driest | precip_warmest | precip_coolest |
|---|---|---|---|---|
| A | 10 | 0 | 1 | 0 |
| B | 0 | 1 | 12 | 13 |
| C | 2 | 16 | 3 | 20 |

I have been trying to use idxmax:
import pandas as pd

temp = pd.DataFrame({'q_1' : [10,6,60],
'q_2' : [50,0,2],
'q_3' : [0,30,9],
'q_4' : [5,1,16]},index=['A','B','C'])
prec = pd.DataFrame({'q_1' : [18,6,3],
'q_2' : [1,13,20],
'q_3' : [0,12,4],
'q_4' : [7,3,0]},index=['A','B','C'])
DF_new = pd.DataFrame({'temp_wettest': temp[prec.idxmax(axis=1)],
'temp_driest' : temp[prec.idxmin(axis=1)],
'precip_warmest': prec[temp.idxmax(axis=1)],
'precip_coolest': prec[temp.idxmin(axis=1)]},index=['A','B','C'])
Output:
temp_wettest temp_driest precip_warmest precip_coolest
A (q, _, 1) (q, _, 3) (q, _, 2) (q, _, 3)
B (q, _, 2) (q, _, 4) (q, _, 3) (q, _, 2)
C (q, _, 2) (q, _, 4) (q, _, 1) (q, _, 2)
I get why idxmax isn't working (it's just passing in a list of column names), but I'm not sure how to get the actual values into the new dataframe. I've also tried using pd.apply(), but I'm unsure of what function to use.
Thanks
If you have pandas version < 1.2.0, try `lookup` (`DataFrame.lookup` was deprecated in 1.2.0 and removed in 2.0):
DF_new = pd.DataFrame({'temp_wettest': temp.lookup(prec.index, prec.idxmax(axis=1)),
'temp_driest' : temp.lookup(prec.index, prec.idxmin(axis=1)),
'precip_warmest': prec.lookup(temp.index, temp.idxmax(1)),
'precip_coolest': prec.lookup(temp.index, temp.idxmin(1))
})
Output:
temp_wettest temp_driest precip_warmest precip_coolest
0 10 0 1 0
1 0 1 12 13
2 2 16 3 20
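For pandas versions where `lookup` is unavailable, the same row-by-row selection can be sketched with `.at`. This is only a minimal illustration: the helper name `pick` is made up here, and it assumes both frames share the same row and column labels. Unlike the output above, it keeps the A/B/C index.

```python
import pandas as pd

temp = pd.DataFrame({'q_1': [10, 6, 60], 'q_2': [50, 0, 2],
                     'q_3': [0, 30, 9], 'q_4': [5, 1, 16]},
                    index=['A', 'B', 'C'])
prec = pd.DataFrame({'q_1': [18, 6, 3], 'q_2': [1, 13, 20],
                     'q_3': [0, 12, 4], 'q_4': [7, 3, 0]},
                    index=['A', 'B', 'C'])

def pick(values, labels):
    # Hypothetical helper: `labels` maps each row label to a column label
    # (as returned by idxmax/idxmin with axis=1); read that cell from `values`.
    return pd.Series([values.at[row, col] for row, col in labels.items()],
                     index=labels.index)

DF_new = pd.DataFrame({
    'temp_wettest': pick(temp, prec.idxmax(axis=1)),
    'temp_driest': pick(temp, prec.idxmin(axis=1)),
    'precip_warmest': pick(prec, temp.idxmax(axis=1)),
    'precip_coolest': pick(prec, temp.idxmin(axis=1)),
})
print(DF_new)
```

The Python-level loop is fine for a handful of locations; for large frames a vectorized approach is likely to be faster.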
I don't know the best alternative for `lookup`, but this might work.
DF_new = pd.DataFrame({'temp_wettest': temp.stack().loc[list(map(tuple,prec.idxmax(axis=1).reset_index().to_numpy()))].tolist(),
'temp_driest' : temp.stack().loc[list(map(tuple,prec.idxmin(axis=1).reset_index().to_numpy()))].tolist(),
'precip_warmest': prec.stack().loc[list(map(tuple,temp.idxmax(axis=1).reset_index().to_numpy()))].tolist(),
'precip_coolest': prec.stack().loc[list(map(tuple,temp.idxmin(axis=1).reset_index().to_numpy()))].tolist()})
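Another possibility, offered as a sketch rather than a definitive replacement, is to turn the column labels into integer positions with `Index.get_indexer` and index the underlying NumPy array. The function name `lookup_np` is hypothetical, and the code assumes `temp` and `prec` have identical row order and column labels.

```python
import numpy as np
import pandas as pd

temp = pd.DataFrame({'q_1': [10, 6, 60], 'q_2': [50, 0, 2],
                     'q_3': [0, 30, 9], 'q_4': [5, 1, 16]},
                    index=['A', 'B', 'C'])
prec = pd.DataFrame({'q_1': [18, 6, 3], 'q_2': [1, 13, 20],
                     'q_3': [0, 12, 4], 'q_4': [7, 3, 0]},
                    index=['A', 'B', 'C'])

def lookup_np(values, col_labels):
    # Hypothetical helper: pick one cell per row of `values`, where
    # `col_labels` holds the column label to read for each row, in row order.
    rows = np.arange(len(values))
    cols = values.columns.get_indexer(col_labels)  # labels -> integer positions
    return values.to_numpy()[rows, cols]

DF_new = pd.DataFrame({
    'temp_wettest': lookup_np(temp, prec.idxmax(axis=1)),
    'temp_driest': lookup_np(temp, prec.idxmin(axis=1)),
    'precip_warmest': lookup_np(prec, temp.idxmax(axis=1)),
    'precip_coolest': lookup_np(prec, temp.idxmin(axis=1)),
}, index=temp.index)  # keep the location labels A, B, C
print(DF_new)
```

This avoids `stack()` and the list of tuples, and passing `index=temp.index` keeps the location labels instead of 0, 1, 2.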