Group by condition in pandas dataframe

I want to cut continuous data into groups. I have some data like this:

Index Age Predict
0     23    0
1     39    0
2     70    0
3     41    1
4     50    0
5     17    0
6     29    1

I tried:

df_1 = df[['Age','Predict']]
data = df_1.sort_values(by='Age')

After sorting:

Index Age Predict
5     17    0
0     23    0
6     29    1
1     39    0
3     41    1
4     50    0
2     70    0

How can I split the data into groups like this:

Index Age Predict

group 1:
5     17    0
0     23    0

group 2:
6     29    1

group 3:
1     39    0

group 4:
3     41    1

group 5:
4     50    0
2     70    0

Thanks for the help.

IIUC, the groups you want are defined by Predict: a new group starts wherever the diff between consecutive rows is not equal to 0. So you could create a column:

data_ = df.sort_values('Age')
data_['gr'] = data_['Predict'].diff().ne(0).cumsum()
print(data_)
   Index  Age  Predict  gr
5      5   17        0   1
0      0   23        0   1
6      6   29        1   2
1      1   39        0   3
3      3   41        1   4
4      4   50        0   5
2      2   70        0   5
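
To see why the chained expression yields group labels, it can help to look at each intermediate step. The following is a minimal sketch that rebuilds the sample data from the question; the names step_diff, step_changed and step_group are only illustrative, and the frame here has no separate Index column:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Age": [23, 39, 70, 41, 50, 17, 29],
    "Predict": [0, 0, 0, 1, 0, 0, 1],
})

data_ = df.sort_values("Age")

# Step 1: difference from the previous row (NaN for the first row).
step_diff = data_["Predict"].diff()
# Step 2: True wherever the difference is not 0, i.e. the value changed.
step_changed = step_diff.ne(0)
# Step 3: a running count of the changes gives one label per run of equal values.
step_group = step_changed.cumsum()

steps = pd.DataFrame({
    "Predict": data_["Predict"],
    "diff": step_diff,
    "changed": step_changed,
    "gr": step_group,
})
print(steps)
```

The first row gets NaN from diff(), and since NaN is not equal to 0, ne(0) marks it True, so the labels start at 1.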

Or, if you want to split your data without creating the group column, one way is to build a dictionary that contains each group:

data_ = df.sort_values('Age')
d = {i: dfg 
     for i,(_, dfg) in enumerate(data_.groupby(data_['Predict'].diff().ne(0).cumsum()),1)}
print(d[1])
   Index  Age  Predict
5      5   17        0
0      0   23        0

df.groupby((df['Predict'] != df['Predict'].shift(1)).cumsum())

Basically, check whether the current value differs from the previous value; if it does, increment the counter. This lets you group by changes in the value of Predict. Run it on the frame already sorted by Age to reproduce the groups in the question.
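
As a small self-contained sketch of this comparison, assuming the sample data from the question and applying the key to the frame sorted by Age (the names data, key and sizes are illustrative):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Age": [23, 39, 70, 41, 50, 17, 29],
    "Predict": [0, 0, 0, 1, 0, 0, 1],
})

data = df.sort_values("Age")

# Compare each value with the one in the row before it.
# The first row is compared with NaN, which counts as "different".
changed = data["Predict"] != data["Predict"].shift()

# Each True bumps the counter, so rows in the same run share a label.
key = changed.cumsum()

# Number of rows in each group.
sizes = data.groupby(key).size()
print(sizes)
```

Because the test uses != rather than subtraction, the same key should also work when the column holds non-numeric labels such as strings.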

Using .groupby and .cumsum():

for i, grp in data.groupby((data['Predict'] != data['Predict'].shift()).cumsum()):
    print('group', i)
    print(grp)

Result:

group 1
   Age  Predict
5   17        0
0   23        0
group 2
   Age  Predict
6   29        1
group 3
   Age  Predict
1   39        0
group 4
   Age  Predict
3   41        1
group 5
   Age  Predict
4   50        0
2   70        0

