Group a DataFrame by consecutive ordered values

Question

I'm trying to group a DataFrame based on the order of its values. Here is my sample code:

import pandas as pd

df = pd.DataFrame([{'c1': 'v1', 'c2': 1},
               {'c1': 'v1', 'c2': 2},
               {'c1': 'v2', 'c2': 3},
               {'c1': 'v1', 'c2': 4},
               {'c1': 'v2', 'c2': 5},
               {'c1': 'v2', 'c2': 6},
               {'c1': 'v3', 'c2': 7}])
df['test'] = 'test'
df1 = df.groupby(['test', 'c1'])['c2'].describe()[['min', 'max']]
print(df1)

Here is the result:

         min  max
test c1          
test v1  1.0  4.0
     v2  3.0  6.0
     v3  7.0  7.0

But I'm looking for a way to get the following result, where each consecutive run of the same c1 value forms its own group:

         min  max
test c1          
test v1  1.0  2.0
     v2  3.0  3.0
     v1  4.0  4.0
     v2  5.0  6.0
     v3  7.0  7.0

Answer 1

Use:

df1 = df.groupby(['test', 'c1', df.c1.ne(df.c1.shift()).cumsum()]).c2.describe()[['min', 'max']].droplevel(2)

Result (groupby sorts the keys by default, so the two v1 runs are listed before the v2 runs):

         min  max
test c1          
test v1  1.0  2.0
     v1  4.0  4.0
     v2  3.0  3.0
     v2  5.0  6.0
     v3  7.0  7.0

Note the droplevel call at the end of the chain. Because it is called on a DataFrame, it is DataFrame.droplevel (the counterpart of pandas.MultiIndex.droplevel), and it removes the helper level at position 2 from the DataFrame's MultiIndex.
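To see why the third grouping key works, it may help to build it step by step. This is only a sketch on the c1 column from the question; the names changed and run_id are made up for illustration and are not part of the answer.

```python
import pandas as pd

# The c1 column from the question, as a standalone Series.
c1 = pd.Series(['v1', 'v1', 'v2', 'v1', 'v2', 'v2', 'v3'], name='c1')

# True wherever a value differs from the one directly above it,
# i.e. at the first row of every consecutive run.
changed = c1.ne(c1.shift())

# A running count of those run starts gives each run its own integer id.
run_id = changed.cumsum()

print(pd.DataFrame({'c1': c1, 'changed': changed, 'run_id': run_id}))
```

Rows that share a run_id belong to the same consecutive block, so the two separate v1 runs get different ids even though their c1 value is the same.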

Answer 2

IIUC, you need to group by consecutive runs of c1:

df1 = (df.assign(group=df["c1"].ne(df["c1"].shift()).cumsum())
         .groupby(['test', 'c1', "group"])['c2'].describe()[['min', 'max']]
         .sort_index(level=2))

print(df1)

               min  max
test c1 group          
test v1 1      1.0  2.0
     v2 2      3.0  3.0
     v1 3      4.0  4.0
     v2 4      5.0  6.0
     v3 5      7.0  7.0
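If the helper level is not wanted in the output, one possible variant is to pass sort=False to groupby so the groups stay in order of first appearance, and then drop the helper level. This is a sketch, not what either answer posted: the level name run is hypothetical, and agg(['min', 'max']) is swapped in for describe(), so the values stay integers rather than floats.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({'c1': ['v1', 'v1', 'v2', 'v1', 'v2', 'v2', 'v3'],
                   'c2': [1, 2, 3, 4, 5, 6, 7]})
df['test'] = 'test'

# Id of each consecutive run of equal c1 values ('run' is a made-up name).
run_id = df['c1'].ne(df['c1'].shift()).cumsum().rename('run')

# sort=False keeps the groups in order of first appearance instead of
# sorting by key; the helper level is dropped once grouping is done.
ordered = (df.groupby(['test', 'c1', run_id], sort=False)['c2']
             .agg(['min', 'max'])
             .droplevel('run'))

print(ordered)
```

The rows should then follow the original row order (v1, v2, v1, v2, v3), which is the layout asked for in the question.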

2 answers

Answer 1: score 2, accepted, 2020-08-20 10:01:35

Answer 2: score 1, 2020-08-20 09:59:28
