a solution for filtering some rows of data based on condition in pandas

Question

I have the following example data and would like to filter it: whenever a row has col1 = 'A' and col2 = 0, I want to keep that row and all following rows up to (but not including) the next row with col1 = 'A'.
I want to do this with a pandas DataFrame, but I don't know how.

The columns can take these values: col1 is 'A', 'B' or 'C'; col2 is 0 or 1 and is set only on rows where col1 is 'A'.

For example, we have this data

col1 col2
 A    0
 C
 A    1 
 B
 C
 A    1 
 B
 B
 C
 A    0 
 B 
 C
 A    1 
 B 
 C
 C

The result I want to achieve is:

col1 col2 
 A    0 
 C 
 A    0 
 B 
 C

Thank you very much

Answer 1

We first group the rows into blocks that each start with an 'A' row, then propagate the first col2 value of each block to all rows of that block. From this result we keep all rows where the propagated value is 0.

 df[df.groupby(df.col1.eq('A').cumsum()).col2.transform('first').eq(0)]

Sample data:

df = pd.DataFrame({'col1': list('ACABCABBCABCABCC'),
                   'col2': [0, None, 1, None, None, 1, None, None, None, 0, None, None, 1, None, None, None]}
                 ).astype({'col2': 'Int32'})

Result:

   col1  col2
0     A     0
1     C  <NA>
9     A     0
10    B  <NA>
11    C  <NA>
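
The one-liner in this answer can be unpacked step by step. The following is a minimal sketch using the sample data above; the intermediate names block_id, block_flag and mask are illustrative only and do not appear in the original answer.

```python
import pandas as pd

# Sample data from the answer: col2 is set only on rows where col1 == 'A'.
df = pd.DataFrame({
    'col1': list('ACABCABBCABCABCC'),
    'col2': [0, None, 1, None, None, 1, None, None, None,
             0, None, None, 1, None, None, None],
}).astype({'col2': 'Int32'})

# Step 1: a new block starts at every 'A'. The running sum of the
# boolean marker gives each block its own integer id.
block_id = df['col1'].eq('A').cumsum()

# Step 2: broadcast the first non-missing col2 value of each block
# (here, the value on the 'A' row) to every row of that block.
block_flag = df.groupby(block_id)['col2'].transform('first')

# Step 3: keep only the rows whose block started with col2 == 0.
mask = block_flag.eq(0)
result = df[mask]

print(result)
```

Note that 'first' picks the first non-missing value in each block, so this relies on col2 being set on the 'A' row, as it is in the example data.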

1 answer

solution1
4 ACCEPTED 2020-05-31 09:34:24
