Increment second column's value if the element in the first column equals the previous row

Question

There are similar questions to this one, but what I am asking is a bit different.

I want to know whether there is a way to implement the code below without a for loop (with map, or a columnar calculation) if possible, or otherwise the fastest way possible.

I have a DataFrame (df) with m rows (> 1E7) and n columns. Column j+1 is initialized with all 1s or 0s.

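for i in range(1, len(df)):  # start at 1: row 0 has no previous row
    # same value as the previous row: continue that row's count
    if df.iloc[i, j] == df.iloc[i-1, j]:
        df.iloc[i, j+1] = df.iloc[i-1, j+1] + 1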

So the example output will look like:

    ... j j+1 ...
  0 ... 3  1  ...
  1 ... 4  1  ...
  2 ... 4  2  ...
  3 ... 4  3  ...
  4 ... 6  1  ...
  5 ... 6  2  ...
  6 ... 7  1  ...
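
For anyone wanting to reproduce this, here is a minimal sketch of the loop on the sample values above. The column names `j` and `j+1` and the initial fill of 1s are assumptions made for illustration; the real frame has more columns. The loop starts at row 1 so that the first row is not compared with the last one through `iloc[-1]`.

```python
import pandas as pd

# Hypothetical two-column frame holding the sample values from the table above.
df = pd.DataFrame({"j": [3, 4, 4, 4, 6, 6, 7]})
df["j+1"] = 1  # counter column initialised with 1s

j = df.columns.get_loc("j")  # positional index of the key column

# Start at 1: row 0 has no previous row to compare with.
for i in range(1, len(df)):
    if df.iloc[i, j] == df.iloc[i - 1, j]:
        df.iloc[i, j + 1] = df.iloc[i - 1, j + 1] + 1

print(df)
```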

Answer 1

There are definitely questions that answer this:

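s = df.iloc[:, j]
# a new block starts wherever the value differs from the previous row
blocks = s.ne(s.shift()).cumsum()
# 1-based position of each row within its block
df.iloc[:, j+1] = s.groupby(blocks).cumcount() + 1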

Output:

   ...  j  j+1   ...
0  ...  3    1   ...
1  ...  4    1   ...
2  ...  4    2   ...
3  ...  4    3   ...
4  ...  6    1   ...
5  ...  6    2   ...
6  ...  7    1   ...
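
To see why this works, it may help to look at the intermediate values. The sketch below rebuilds only the key column from the sample (the Series name `j` and the helper column names are illustrative, not part of the original frame) and prints each step side by side.

```python
import pandas as pd

# Sample key column from the question; the name "j" is only illustrative.
s = pd.Series([3, 4, 4, 4, 6, 6, 7], name="j")

# True wherever the value differs from the previous row (row 0 compares with NaN).
changed = s.ne(s.shift())

# Running count of changes, which serves as an id for each consecutive run.
blocks = changed.cumsum()

# 1-based position of each row inside its run.
counter = s.groupby(blocks).cumcount() + 1

print(pd.DataFrame({"j": s, "changed": changed, "block": blocks, "j+1": counter}))
```

Because every step is a vectorized column operation, no Python-level loop over the rows is involved.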

Answer 2

Sounds like this is what you're after.

df['j+1'] = df.groupby('j').cumcount() + 1
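
One caveat worth checking against your data: `groupby('j').cumcount()` numbers every occurrence of a value across the whole column, so it matches the loop only when equal values are always adjacent, as they are in the sample. A small sketch with made-up values, where 4 reappears after a 6, shows how it differs from a counter over consecutive runs (the column names `per_value` and `per_run` are hypothetical):

```python
import pandas as pd

# Made-up values: 4 appears in two separate runs.
df = pd.DataFrame({"j": [4, 4, 6, 4, 4]})

# Counts occurrences of each value over the whole column.
df["per_value"] = df.groupby("j").cumcount() + 1

# Counts position within each run of consecutive equal values.
blocks = df["j"].ne(df["j"].shift()).cumsum()
df["per_run"] = df.groupby(blocks).cumcount() + 1

print(df)
```

If the column is sorted or values never recur after a gap, the two columns are identical and the shorter form is enough.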

2 answers

solution1
1 ACCEPTED 2020-10-26 18:17:56

solution2
1 2020-10-26 18:31:26
