How can I combine chronologically consecutive rows based on a condition in pandas?

I have a dataset that looks like this:

   begin   end   type
0  1       3     A
1  3       7     A
2  7       9     A
3  9      15     B
4  15     17     B
5  17     20     A

I would like to group rows that are of the same type and chronologically consecutive, so that the resulting DataFrame looks like this:

   begin   end   type
0  1       9     A
1  9      17     B
2  17     20     A

I could, of course, write a function that checks each row and looks ahead until a different type is found, but I feel there must be an easier way. What would be the most pandas way to do this?

I have seen other similar questions, but none of them really apply to my case.

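Given your data, you can shift the end column within each type and compare it to begin. A row whose begin differs from the previous end of its type starts a new group, and a cumulative sum of those flags labels the groups: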

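# Flag rows whose begin differs from the previous end of the same type
# (the start of a new run), then number the runs with a cumulative sum.
groups = df.groupby('type')['end'].shift().ne(df['begin']).cumsum()
# Collapse each run to its first begin and last end.
(df.groupby(['type', groups])
   .agg({'begin':'first', 'end':'last'})
)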

Output:

        begin  end
type              
A    1      1    9
     3     17   20
B    2      9   17
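
The result above is indexed by type and by the run label, and it is sorted by type rather than chronologically. If you want the flat layout from the question, here is a minimal sketch that builds on the same idea. The sample frame is rebuilt from the question's table, and the names prev_end, block, and result are only illustrative:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "begin": [1, 3, 7, 9, 15, 17],
    "end": [3, 7, 9, 15, 17, 20],
    "type": ["A", "A", "A", "B", "B", "A"],
})

# Previous end within the same type (NaN for the first row of each type).
prev_end = df.groupby("type")["end"].shift()

# A mismatch between begin and the previous end starts a new run;
# the cumulative sum turns those flags into run labels.
groups = prev_end.ne(df["begin"]).cumsum().rename("block")

result = (
    df.groupby(["type", groups], sort=False)
      .agg(begin=("begin", "first"), end=("end", "last"))
      .reset_index()                 # bring type and block back as columns
      .drop(columns="block")         # the run label is no longer needed
      .sort_values("begin")          # restore chronological order
      .reset_index(drop=True)
      [["begin", "end", "type"]]     # column order used in the question
)
print(result)
```

On the sample data this should give the three rows requested in the question: (1, 9, A), (9, 17, B), (17, 20, A). It relies on each run's rows already being in chronological order, as they are in the question.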
