How to count consecutive string values of one column grouped by column values of another in a dataframe?

Question

I have the following dataframe:


|Levels|Labels|Confidence|
|------|------|----------|
|0     |Hands |0.8       |
|0     |Leg   |0.7       |
|0     |Eye   |0.9       |
|1     |Ear   |0.9       |
|1     |Eye   |0.8       |
|2     |Hands |0.9       |
|2     |Eye   |0.8       |
|3     |Eye   |0.8       |
:
:
:

I want to check whether any of my labels appear in consecutive levels (0, 1, 2, 3, 4, 5, ...) and, if so, for how many consecutive levels (that is, the length of the consecutive run for each body part). In the example dataset above, the label "Eye" is present in 4 consecutive levels, "Hands" in only 1, and so on.

There is a similar question here: How to find the count of consecutive same string values in a pandas dataframe?
Modifying the solution there did not work for me. I also tried converting the data into a NumPy array, which did not work either.

Could you take a look at this?

Answer 1

This should work. Just define a custom aggregation function.

import pandas as pd

df = pd.DataFrame({
    'lvl': [0, 0, 0, 1, 1, 2, 2, 3, 3, 3, 4],
    'label': ['a', 'b', 'c', 'a', 'b', 'a', 'c', 'a', 'b', 'c', 'c'],
    'confidence': [0.1, 0.5, 0.3, 0.6, 0.2, 0.4, 0.7, 0.8, 0.5, 0.2, 0.8]
})
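
# For each label, flag the rows where the level is not exactly one more than
# the previous level (the start of a new run), number the runs with cumsum(),
# and take the size of the largest run.
agg_func = {
    'lvl': [('length', lambda x: x.ne((x+1).shift()).cumsum().value_counts().max())]
}

result = df.groupby('label').agg(agg_func)
# Drop the outer 'lvl' column level so the only column left is 'length'.
result.columns = result.columns.droplevel(0)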

print(result)

       length
label        
a           4
b           2
c           3
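
For reference, the same idea can be applied to the data from the question. This is only a sketch: the helper name longest_consecutive_run is made up for illustration, the column names are taken from the question's table, and it assumes the levels are integers. Unlike the lambda above, it sorts and de-duplicates the levels within each label first, so it does not rely on the rows already being ordered by level.

```python
import pandas as pd


def longest_consecutive_run(levels: pd.Series) -> int:
    """Length of the longest run of consecutive integers in `levels`.

    Hypothetical helper (not part of pandas); assumes integer levels.
    """
    # Sort and de-duplicate so row order and repeated levels do not matter.
    s = pd.Series(sorted(set(levels)))
    if s.empty:
        return 0
    # A new run starts wherever the gap to the previous level is not exactly 1
    # (the first element has a NaN gap, so it always starts a run).
    run_id = s.diff().ne(1).cumsum()
    return int(run_id.value_counts().max())


# Data copied from the question's table.
df = pd.DataFrame({
    "Levels": [0, 0, 0, 1, 1, 2, 2, 3],
    "Labels": ["Hands", "Leg", "Eye", "Ear", "Eye", "Hands", "Eye", "Eye"],
    "Confidence": [0.8, 0.7, 0.9, 0.9, 0.8, 0.9, 0.8, 0.8],
})

result = (
    df.groupby("Labels")["Levels"]
      .agg(longest_consecutive_run)
      .rename("length")
)
print(result)
```

On this data, "Eye" gets a length of 4 and "Hands" a length of 1, which matches what the question expects.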
