How to group by a column and then reorder a column within each group in Python

Question

I have the following grouped dataframe:

                 Value1      Value2

    Category   
------------------------------------   
0          0         62          44 
           1         55          46 
           2         73          75 
1          0         61          49 
           1         55          46 
           2         34          35  
2          0         62          48 
           1         55          46 
           2         44          25

For each group, I want to sort the "Value1" column in ascending order while keeping the order of the "Category" column. The goal is that "Category" 0 corresponds to the lowest "Value1" value and the highest "Category" (2 in this sample) corresponds to the highest "Value1" value. Each "Value2" value should stay with the "Value1" value it was originally paired with. This is the output dataframe I want to produce:

                 Value1      Value2

    Category   
------------------------------------   
0          0         55          46    
           1         62          44
           2         73          75                 
1          0         34          35
           1         55          46  
           2         61          49
2          0         44          25
           1         55          46 
           2         62          48

How can I accomplish this in Python? I have tried using .reset_index() and .sort_values(), but I am not getting the grouped dataframe I want. I tried:

df.sort_values(['Value1'],ascending=True).groupby('Category')

but this just produces <pandas.core.groupby.generic.DataFrameGroupBy object at...>, which is not useful.

Answer 1

One way is to use sort_values with the index level name:

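tmp = df.index.names                         # remember the original level names
df.index.names = ["tmp", "Category"]         # name the outer level so it can be a sort key
new_df = df.sort_values(["tmp", "Value1"])   # sort by group, then by Value1 within each group
new_df.index = df.index.rename(tmp)          # reattach the index in its original order
print(new_df)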

Output:

            Value1  Value2
  Category                
0 0             55      46
  1             62      44
  2             73      75
1 0             34      35
  1             55      46
  2             61      49
2 0             44      25
  1             55      46
  2             62      48
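For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's dataframe and wraps the same idea in a function. The name sort_within_groups and the temporary level name "tmp" are my own placeholders, not pandas API. The sketch assumes the rows are already grouped by the outer index level, with "Category" in the order you want to keep.

```python
import pandas as pd

# Rebuild the dataframe from the question: unnamed outer level, inner level "Category".
index = pd.MultiIndex.from_product([[0, 1, 2], [0, 1, 2]], names=[None, "Category"])
df = pd.DataFrame(
    {
        "Value1": [62, 55, 73, 61, 55, 34, 62, 55, 44],
        "Value2": [44, 46, 75, 49, 46, 35, 48, 46, 25],
    },
    index=index,
)


def sort_within_groups(df, by="Value1"):
    """Sort rows by `by` inside each outer-level group, keeping the index labels in place.

    Hypothetical helper: assumes the outer index level is already in sorted order.
    """
    tmp = df.copy()                                   # leave the caller's frame untouched
    tmp.index = tmp.index.set_names("tmp", level=0)   # give the outer level a usable name
    out = tmp.sort_values(["tmp", by])                # group first, then the value column
    out.index = df.index                              # original labels, original order
    return out


result = sort_within_groups(df)
print(result)
```

Unlike the snippet above, this version does not leave the temporary level name on the original df.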

Answer 2

You can sort the dataframe on the values and then on the first level of the index:

df = (df.sort_values(by=['Value1', 'Value2'])
        .sort_index(level=0, sort_remaining=False)
      )
print(df)

            Value1  Value2
  Category                
0 1             55      46
  0             62      44
  2             73      75
1 2             34      35
  1             55      46
  0             61      49
2 2             44      25
  1             55      46
  0             62      48

Then you need to rewrite the second index level ("Category") using a cumcount per group:

import pandas as pd

# sort_values/sort_index return new objects, so the result has to be assigned
df = df.sort_values(by=['Value1', 'Value2']).sort_index(level=0, sort_remaining=False)
idx = pd.MultiIndex.from_arrays([df.index.get_level_values(0),
                                 df.groupby(level=0).cumcount()],  # 0, 1, 2, ... within each group
                                names=(None, 'Category')
                                )
df.index = idx

Output:

            Value1  Value2
  Category                
0 0             55      46
  1             62      44
  2             73      75
1 0             34      35
  1             55      46
  2             61      49
2 0             44      25
  1             55      46
  2             62      48
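The two steps can also be sketched as one self-contained script. This is only an illustration under my own assumptions: it rebuilds the question's data, and it uses a temporary level name "group" (a placeholder, not something from the question) so that a single sort_values call can order by group and then by Value1.

```python
import pandas as pd

# Rebuild the dataframe from the question.
index = pd.MultiIndex.from_product([[0, 1, 2], [0, 1, 2]], names=[None, "Category"])
df = pd.DataFrame(
    {
        "Value1": [62, 55, 73, 61, 55, 34, 62, 55, 44],
        "Value2": [44, 46, 75, 49, 46, 35, 48, 46, 25],
    },
    index=index,
)

# Drop the old Category labels; they are regenerated after sorting.
flat = df.reset_index(level="Category", drop=True)

# Name the remaining level so it can be used as a sort key next to a column.
flat = flat.rename_axis("group").sort_values(["group", "Value1"])

# Number the rows 0, 1, 2, ... inside each group, in their new order.
category = flat.groupby(level="group").cumcount()

# Rebuild the two-level index with the original level names.
flat.index = pd.MultiIndex.from_arrays(
    [flat.index, category], names=[None, "Category"]
)
print(flat)
```

Because the Category labels are regenerated with cumcount, this variant does not depend on the original labels being in any particular order inside a group.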

Answer 3

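You can sort the value columns and assign them back by position, leaving the key column untouched. Note that this example sorts across the whole dataframe, not within each group: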

import pandas as pd

df = pd.DataFrame({'col1': [0, 1, 2, 0, 1, 2], 'col2': [8, 9, 6, 40, 3, 20], 'col3': [5, 6, 0, 40, 3, 20]})
sorted_df = df.sort_values(['col2'], ascending=True)
df[['col2', 'col3']] = sorted_df[['col2', 'col3']].values
print(df)

Output:

   col1  col2  col3
0     0     3     3
1     1     6     0
2     2     8     5
3     0     9     6
4     1    20    20
5     2    40    40
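To get the per-group behaviour the question asks for, the same positional assignment can be combined with a group key in the sort. The sketch below is an assumption-laden variation, not part of the answer above: the "group" column is hypothetical, and the rows must already be contiguous and ordered by that column, otherwise the values would be written back to the wrong rows.

```python
import pandas as pd

# Same numbers as above, plus a hypothetical "group" column marking two groups.
df = pd.DataFrame(
    {
        "group": [0, 0, 0, 1, 1, 1],
        "col1": [0, 1, 2, 0, 1, 2],
        "col2": [8, 9, 6, 40, 3, 20],
        "col3": [5, 6, 0, 40, 3, 20],
    }
)

# Sort by group first, then by col2 inside each group.
sorted_df = df.sort_values(["group", "col2"])

# Write the sorted values back by position; "group" and "col1" keep their order.
result = df.copy()
result[["col2", "col3"]] = sorted_df[["col2", "col3"]].to_numpy()
print(result)
```

Converting to a NumPy array before assigning is what makes the assignment positional; assigning the sorted dataframe directly would align on the index and undo the sort.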

Answer 4

A one-line solution is to combine DataFrame.rename_axis with DataFrame.sort_values and DataFrame.set_index:

df = df.rename_axis(index={None:'tmp'}).sort_values(['tmp', "Value1"]).set_index(df.index)
print(df)

Output:

            Value1  Value2
  Category                
0 0             55      46
  1             62      44
  2             73      75
1 0             34      35
  1             55      46
  2             61      49
2 0             44      25
  1             55      46
  2             62      48

4 answers

Answer 1: 1, accepted, 2021-11-24 07:47:32

Answer 2: 0, 2021-11-24 07:56:51

Answer 3: 0, 2021-11-24 08:05:29

Answer 4: 0, 2021-11-24 08:24:57
