Iterate over rows in pandas df

Question

I have the df shown below:

           CX     CY   CS
97539   0.39896 0.7787  0
97540   0.39896 0.7787  0
97541   0.39896 0.7787  0
97542   0.39896 0.7787  0
97543   0.39896 0.7787  0
97544   0.39896 0.7787  0
97545   0.39896 0.7787  0
97546   0.39896 0.7787  0
97547   0.39896 0.7787  0
97548   0.39896 0.7787  0
97549   0.39896 0.7787  0
97550   0.39896 0.7787  0
97551   0.39896 0.7787  0
97552   0.39896 0.7787  0
97553   0.39896 0.7787  0
97554   0.39896 0.7787  0
97555   0.39896 0.7787  0
97556   0.39896 0.7787  0
97557   0.39896 0.7787  0
97558   0.39896 0.7787  0
97559   0.39896 0.7787  0
97560   0.39896 0.7787  0
97561   0.39896 0.7787  1
97562   0.39896 0.7787  0
97563   0.39896 0.7787  0
97564   0.39896 0.7787  0
97565   0.39896 0.7787  0

I want to keep only the part of the df up to the point where the value in the 'CS' column becomes 1, and drop the remaining rows. So I want something like this:

           CX     CY   CS
97539   0.39896 0.7787  0
97540   0.39896 0.7787  0
97541   0.39896 0.7787  0
97542   0.39896 0.7787  0
97543   0.39896 0.7787  0
97544   0.39896 0.7787  0
97545   0.39896 0.7787  0
97546   0.39896 0.7787  0
97547   0.39896 0.7787  0
97548   0.39896 0.7787  0
97549   0.39896 0.7787  0
97550   0.39896 0.7787  0
97551   0.39896 0.7787  0
97552   0.39896 0.7787  0
97553   0.39896 0.7787  0
97554   0.39896 0.7787  0
97555   0.39896 0.7787  0
97556   0.39896 0.7787  0
97557   0.39896 0.7787  0
97558   0.39896 0.7787  0
97559   0.39896 0.7787  0
97560   0.39896 0.7787  0
97561   0.39896 0.7787  1

Any ideas on how to approach this? Note that the 1 can be on any row, so I can't just use .iloc. Ideally, I would like to avoid iterrows().

Answer 1

If there is always at least one 1, compare the values with Series.eq, get the index label of the first 1 with Series.idxmax, and then slice with DataFrame.loc (label slices include the end label):

df1 = df.loc[: df['CS'].eq(1).idxmax()]
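As a rough illustration of the one-liner above, here is a minimal sketch on a made-up six-row frame. The sample data and the name first_one are assumptions for the example, not the real data from the question:

```python
import pandas as pd

# Hypothetical stand-in for the frame in the question: a single 1 in 'CS'
df = pd.DataFrame(
    {"CX": [0.39896] * 6, "CY": [0.7787] * 6, "CS": [0, 0, 0, 1, 0, 0]},
    index=range(97539, 97545),
)

# eq(1) gives a boolean Series; idxmax returns the label of the first True
first_one = df["CS"].eq(1).idxmax()

# .loc slices by label and includes the end label, so the row with the 1 is kept
df1 = df.loc[:first_one]

print(first_one)  # 97542
print(len(df1))   # 4
```

This relies on the index labels being unique; with duplicate labels a label slice may not behave as expected.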

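If there may be no 1 at all, check for that first and return an empty DataFrame in that case (otherwise idxmax returns the first index label and only the first row is kept):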

m = df['CS'].eq(1)
df1 = df.loc[: m.idxmax()] if m.any() else pd.DataFrame()
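To see why the guard matters, here is a small sketch under assumed data. The helper name keep_until_first_one is hypothetical, and returning df.iloc[0:0] (an empty frame that keeps the column names) is a variation of this sketch; swap in pd.DataFrame() to match the answer exactly:

```python
import pandas as pd

def keep_until_first_one(df, col="CS"):
    """Hypothetical helper: rows up to and including the first 1 in `col`."""
    m = df[col].eq(1)
    if not m.any():
        # No 1 anywhere: return an empty frame with the same columns
        return df.iloc[0:0]
    return df.loc[: m.idxmax()]

# Made-up frames for illustration
no_ones = pd.DataFrame({"CX": [0.1, 0.2, 0.3], "CY": [1.0, 1.0, 1.0], "CS": [0, 0, 0]})
with_one = pd.DataFrame({"CX": [0.1, 0.2, 0.3], "CY": [1.0, 1.0, 1.0], "CS": [0, 1, 0]})

# Without the guard, idxmax on an all-False mask falls back to the first label
unguarded = no_ones.loc[: no_ones["CS"].eq(1).idxmax()]
print(len(unguarded))                       # 1 (just the first row)

print(len(keep_until_first_one(no_ones)))   # 0
print(len(keep_until_first_one(with_one)))  # 2
```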

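Or use a trick with Series.cummax in boolean indexing; the Series only has to be reversed twice. This also returns an empty DataFrame when there is no 1, but if there are several 1s it keeps the rows up to the last one: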

df1 = df[df['CS'].iloc[::-1].eq(1).cummax().iloc[::-1]]
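The following sketch, on made-up data with two 1s, walks through what the reversed cummax mask selects and how it differs from the idxmax approach. The last variant (a forward cumsum with shift) is an assumption of this sketch rather than part of the answer:

```python
import pandas as pd

# Made-up data with two 1s so the approaches give different results
df = pd.DataFrame({"CS": [0, 1, 0, 1, 0]}, index=range(100, 105))

m = df["CS"].eq(1)

# Reverse, take the running maximum, reverse back:
# True for every row at or before the LAST 1
mask_last = m.iloc[::-1].cummax().iloc[::-1]
up_to_last = df[mask_last]

# idxmax stops at the FIRST 1
up_to_first = df.loc[: m.idxmax()]

# Forward-only variant (assumption, not from the answer): keep rows that
# have no 1 strictly before them. With no 1 at all this keeps every row.
mask_first = m.astype(int).cumsum().shift(fill_value=0).eq(0)
up_to_first_mask = df[mask_first]

print(up_to_last.index.tolist())        # [100, 101, 102, 103]
print(up_to_first.index.tolist())       # [100, 101]
print(up_to_first_mask.index.tolist())  # [100, 101]
```

With a single 1 in the column, as in the question, all three selections are the same.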

solution1
0 ACCEPTED 2021-04-09 11:58:44