
Selecting values from pandas dataframe based off of columns with min/max values in another dataframe

I have two dataframes with different quarters of the year as columns and particular locations as rows:

Temperature:

   q_1  q_2  q_3  q_4
A   10   50    0    5
B    6    0   30    1
C   60    2    9   16

Precipitation:

   q_1  q_2  q_3  q_4
A   18    1    0    7
B    6   13   12    3
C    3   20    4    0

I am trying to create another dataframe whose columns are populated by the temperatures for the wettest/driest quarter and the precipitation of the warmest/coolest quarters for each location:

DF_new:

   temp_wettest  temp_driest  precip_warmest  precip_coolest
A            10            0               1               0
B             0            1              12              13
C             2           16               3              20

I have been trying to use idxmax:

import pandas as pd

temp = pd.DataFrame({'q_1' : [10,6,60],
                     'q_2' : [50,0,2],
                     'q_3' : [0,30,9],
                     'q_4' : [5,1,16]},index=['A','B','C'])
prec = pd.DataFrame({'q_1' : [18,6,3],
                     'q_2' : [1,13,20],
                     'q_3' : [0,12,4],
                     'q_4' : [7,3,0]},index=['A','B','C'])

DF_new = pd.DataFrame({'temp_wettest': temp[prec.idxmax(axis=1)],
                       'temp_driest' : temp[prec.idxmin(axis=1)],
                       'precip_warmest': prec[temp.idxmax(axis=1)],
                       'precip_coolest': prec[temp.idxmin(axis=1)]},index=['A','B','C'])

<OUT>
    temp_wettest    temp_driest precip_warmest  precip_coolest
A      (q, _, 1)      (q, _, 3)      (q, _, 2)       (q, _, 3)
B      (q, _, 2)      (q, _, 4)      (q, _, 3)       (q, _, 2)
C      (q, _, 2)      (q, _, 4)      (q, _, 1)       (q, _, 2)

I get why idxmax isn't working (it's just passing in a list of column names), but I'm not sure how to get the actual values into the new dataframe. I've also tried using DataFrame.apply(), but I'm unsure of what function to use.
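To make the failure concrete, here is a small sketch using the same data as above (the names `wettest` and `selected` are only for illustration). It shows that idxmax returns one column label per row, and that indexing temp with those labels selects whole columns rather than one cell per location:

```python
import pandas as pd

temp = pd.DataFrame({'q_1': [10, 6, 60],
                     'q_2': [50, 0, 2],
                     'q_3': [0, 30, 9],
                     'q_4': [5, 1, 16]}, index=['A', 'B', 'C'])
prec = pd.DataFrame({'q_1': [18, 6, 3],
                     'q_2': [1, 13, 20],
                     'q_3': [0, 12, 4],
                     'q_4': [7, 3, 0]}, index=['A', 'B', 'C'])

# One column label per location: the quarter with the most precipitation.
wettest = prec.idxmax(axis=1)
print(wettest.tolist())          # ['q_1', 'q_2', 'q_2']

# Using those labels as a column selector returns full columns
# (one per label, duplicates included), not a single value per row.
selected = temp[wettest]
print(list(selected.columns))    # ['q_1', 'q_2', 'q_2']
print(selected.shape)            # (3, 3)
```

What is needed instead is a row-wise lookup that pairs each index label with its own column label.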

Thanks

If you have pandas version < 1.2.0, try lookup (DataFrame.lookup was deprecated in 1.2.0 and removed in 2.0):

DF_new = pd.DataFrame({'temp_wettest': temp.lookup(prec.index, prec.idxmax(axis=1)),
                       'temp_driest' : temp.lookup(prec.index, prec.idxmin(axis=1)),
                       'precip_warmest': prec.lookup(temp.index, temp.idxmax(1)),
                       'precip_coolest': prec.lookup(temp.index, temp.idxmin(1))
                      })

Output:

   temp_wettest  temp_driest  precip_warmest  precip_coolest
0            10            0               1               0
1             0            1              12              13
2             2           16               3              20
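For pandas 1.2.0 and later, one possible replacement for lookup is plain NumPy integer indexing. This is a sketch, not an official recipe: `pick` is a hypothetical helper name, and it assumes both frames share the same row labels and that every label returned by idxmax/idxmin exists in the other frame's columns (get_indexer returns -1 for a missing label, which would silently select the last column).

```python
import numpy as np
import pandas as pd

temp = pd.DataFrame({'q_1': [10, 6, 60],
                     'q_2': [50, 0, 2],
                     'q_3': [0, 30, 9],
                     'q_4': [5, 1, 16]}, index=['A', 'B', 'C'])
prec = pd.DataFrame({'q_1': [18, 6, 3],
                     'q_2': [1, 13, 20],
                     'q_3': [0, 12, 4],
                     'q_4': [7, 3, 0]}, index=['A', 'B', 'C'])

def pick(values, labels):
    """Hypothetical helper: for each row of `values`, take the cell
    under the column named in `labels` (a Series indexed like `values`)."""
    labels = labels.reindex(values.index)        # align labels to the row order
    rows = np.arange(len(values))                # positional row numbers 0..n-1
    cols = values.columns.get_indexer(labels)    # positional column number per row
    return pd.Series(values.to_numpy()[rows, cols], index=values.index)

DF_new = pd.DataFrame({
    'temp_wettest':   pick(temp, prec.idxmax(axis=1)),
    'temp_driest':    pick(temp, prec.idxmin(axis=1)),
    'precip_warmest': pick(prec, temp.idxmax(axis=1)),
    'precip_coolest': pick(prec, temp.idxmin(axis=1)),
})
print(DF_new)
```

Because the helper returns a Series carrying the original index, DF_new keeps the A/B/C row labels instead of the 0/1/2 default index shown in the output above.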


I don't know the best alternative for lookup, but this might work:

DF_new = pd.DataFrame({'temp_wettest': temp.stack().loc[list(map(tuple,prec.idxmax(axis=1).reset_index().to_numpy()))].tolist(),
                       'temp_driest' : temp.stack().loc[list(map(tuple,prec.idxmin(axis=1).reset_index().to_numpy()))].tolist(),
                       'precip_warmest': prec.stack().loc[list(map(tuple,temp.idxmax(axis=1).reset_index().to_numpy()))].tolist(),
                       'precip_coolest': prec.stack().loc[list(map(tuple,temp.idxmin(axis=1).reset_index().to_numpy()))].tolist()})
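The one-liners above pack several steps together. As a rough sketch of the same idea for a single column (the intermediate names `stacked`, `wettest` and `keys` are only for illustration, and `zip` is swapped in for the `reset_index().to_numpy()` step):

```python
import pandas as pd

temp = pd.DataFrame({'q_1': [10, 6, 60],
                     'q_2': [50, 0, 2],
                     'q_3': [0, 30, 9],
                     'q_4': [5, 1, 16]}, index=['A', 'B', 'C'])
prec = pd.DataFrame({'q_1': [18, 6, 3],
                     'q_2': [1, 13, 20],
                     'q_3': [0, 12, 4],
                     'q_4': [7, 3, 0]}, index=['A', 'B', 'C'])

# 1. Flatten temp into a Series keyed by (location, quarter).
stacked = temp.stack()

# 2. For each location, find the label of its wettest quarter.
wettest = prec.idxmax(axis=1)

# 3. Pair every location with its quarter label to form MultiIndex keys.
keys = list(zip(wettest.index, wettest))
print(keys)           # [('A', 'q_1'), ('B', 'q_2'), ('C', 'q_2')]

# 4. Select exactly those (location, quarter) cells from the stacked Series.
temp_wettest = stacked.loc[keys].tolist()
print(temp_wettest)   # [10, 0, 2]
```

The other three columns follow the same pattern with idxmin and with the roles of temp and prec swapped.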

