
Find max and min values for several numeric columns and return a dataframe with the corresponding row values

I have the following dataset:

[image: enter image description here]

For each year column, I would like to find the max and min values and return both values together with the corresponding 'Geo' value for each.

For instance, for '1950', '1951', and so on, I would like to produce a dataframe like this one:

[image: enter image description here]

This is a similar thread, but the approaches suggested there don't seem to work because my columns have numeric headers, and my desired result is slightly different.

Any advice would be helpful. Thanks.

This should work, though there is surely a better solution. I assumed your initial dataframe is a pandas dataframe named df.

import pandas as pd

# One row label per statistic; each year becomes a column.
dff = pd.DataFrame({'row_labels':['Max_value','Max_geo','Min_value','Min_geo']})

for col in df.columns[2:]:  # the first two columns are not years; start at column 1950
    col_list = []
    # Append in the same order as row_labels: max first, then min.
    col_list.append(df[col].max())
    col_list.append(df.loc[df[col] == df[col].max(),'Geo'].values[0])
    col_list.append(df[col].min())
    col_list.append(df.loc[df[col] == df[col].min(),'Geo'].values[0])

    dff[col] = col_list

dff.set_index('row_labels', inplace = True, drop = True)
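As a hedged aside, the loop above can probably be collapsed by making 'Geo' the index and calling the column-wise max, idxmax, min and idxmin methods directly. The sketch below is not part of the original answer; it assumes the only non-year columns are 'Geo' and 'Geo code', and it uses a small sample frame with made-up variable names (years, res) purely for illustration.

```python
import pandas as pd

# Sample data for illustration only; the real df comes from the question.
df = pd.DataFrame({
    'Geo': ['Afghanistan', 'Albania', 'Algeria', 'Angola'],
    'Geo code': [4, 8, 12, 24],
    '1950': [27.638, 54.191, 42.087, 35.524],
    '1951': [27.878, 54.399, 42.282, 35.599],
})

# Assumption: every column other than 'Geo' and 'Geo code' is a year.
years = df.set_index('Geo').drop(columns='Geo code')

# Each method returns one value per year column; idxmax/idxmin return
# the index label ('Geo') of the first row holding the extreme value.
res = pd.DataFrame({
    'Max_value': years.max(),
    'Max_geo': years.idxmax(),
    'Min_value': years.min(),
    'Min_geo': years.idxmin(),
}).T

print(res)
```

On ties, idxmax and idxmin return the first matching row, which matches the .values[0] choice in the loop.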

    

You can do this without looping or comparing values to find the max, using max, min, idxmax and idxmin as follows (assuming your dataframe is df):

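# If df has other non-year columns (e.g. 'Geo code'), list them in id_vars too.
(df.melt(id_vars='Geo', var_name='year')
   .set_index('Geo')  # column names are case-sensitive: 'Geo', not 'geo'
   .groupby('year')
   .agg({'value': ('max', 'idxmax', 'min', 'idxmin')})
   .T)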
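To make the melt approach easier to try, here is a self-contained sketch. It is an illustration under stated assumptions, not the answerer's exact code: the sample frame includes a 'Geo code' column, so both identifier columns are passed to id_vars, and the names long_df and res are invented for the example.

```python
import pandas as pd

# Sample data for illustration only.
df = pd.DataFrame({
    'Geo': ['Afghanistan', 'Albania', 'Algeria', 'Angola'],
    'Geo code': [4, 8, 12, 24],
    '1950': [27.638, 54.191, 42.087, 35.524],
    '1951': [27.878, 54.399, 42.282, 35.599],
})

# Reshape to long form: one row per (Geo, year) pair, values in 'value'.
# Assumption: 'Geo' and 'Geo code' are the only non-year columns.
long_df = df.melt(id_vars=['Geo', 'Geo code'], var_name='year').set_index('Geo')

# With 'Geo' as the index, idxmax/idxmin report the country per year.
res = long_df.groupby('year')['value'].agg(['max', 'idxmax', 'min', 'idxmin']).T

print(res)
```

Selecting the 'value' column before agg keeps the result's row labels flat (max, idxmax, min, idxmin) rather than nested under 'value'.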

You can use df.set_index with stack and Groupby.agg:

In [1915]: df = pd.DataFrame({'Geo':['Afghanistan', 'Albania', 'Algeria', 'Angola'], 'Geo code':[4,8,12,24], '1950':[27.638, 54.191, 42.087, 35.524], '1951':[27.878, 54.399, 42.282, 35.599]})

In [1914]: df
Out[1914]: 
           Geo  Geo code    1950    1951
0  Afghanistan         4  27.638  27.878
1      Albania         8  54.191  54.399
2      Algeria        12  42.087  42.282
3       Angola        24  35.524  35.599

In [1916]: x = df.set_index('Geo').stack().reset_index(level=1, name='value').query('level_1 != "Geo code"')

In [1917]: res = x.groupby('level_1').agg({'value': ('max', 'idxmax', 'min', 'idxmin')}).T

In [1918]: res
Out[1918]: 
level_1              1950         1951
value max          54.191       54.399
      idxmax      Albania      Albania
      min          27.638       27.878
      idxmin  Afghanistan  Afghanistan
