How can I get the count of consecutive positive numbers in each column of a 2-dimensional df in Python/pandas

Question

      X     y
a   1.0  -1.0
b  -2.0   2.0
c   3.0  -3.0
d   2.1   4.0

Output:
         X     y
a      1.0  -1.0
b     -2.0   2.0
c      3.0  -3.0
d      2.1   4.0
Count  2     1

In the first column, the count resets to 0 at row b because of the -2, so the final count is 2 (rows c and d). The result needs to be a df with the count appended as the last row.

Answer 1

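Let us use cumsum to define your function. Each negative value starts a new group, and the size of the last group of non-negative values is the count:

import pandas as pd

def yourfun(x):
    # x.lt(0).cumsum() labels the runs; keep values >= 0 and take the size of the last run
    return x[x.ge(0)].groupby(x.lt(0).cumsum()).size().iloc[-1]

df.loc['Count'] = df.apply(yourfun)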
df
Out[62]: 
         X    y
a      1.0 -1.0
b     -2.0  2.0
c      3.0 -3.0
d      2.1  4.0
Count  2.0  1.0
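
To make the cumsum trick easier to follow, here is a small sketch that breaks it into steps. The helper name run_sizes and the second series z are my own additions for illustration, not part of the answer. Note that x.ge(0) treats 0 as part of a run, and that .iloc[-1] picks the last run while .max() would pick the longest one.

```python
import pandas as pd

def run_sizes(x: pd.Series) -> pd.Series:
    """Sizes of the runs of non-negative values in x (hypothetical helper)."""
    # Every negative value bumps the counter, so each run gets its own label.
    run_id = x.lt(0).cumsum()
    # Keep only the non-negative rows and count how many fall under each label.
    return x[x.ge(0)].groupby(run_id).size()

# Column X from the question.
x = pd.Series([1.0, -2.0, 3.0, 2.1], index=list("abcd"))
labels = x.lt(0).cumsum()      # a=0, b=1, c=1, d=1
sizes = run_sizes(x)           # run 0 has 1 row, run 1 has 2 rows
last_run = sizes.iloc[-1]      # what yourfun returns

# A made-up series where the last run is not the longest one.
z = pd.Series([1.0, 2.0, 3.0, -1.0, 4.0])
sizes_z = run_sizes(z)
last_run_z = sizes_z.iloc[-1]  # length of the last run
longest_run_z = sizes_z.max()  # length of the longest run
```

If you want the longest run rather than the trailing one, swapping .iloc[-1] for .max() in yourfun should be enough.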

Answer 2

There is a pure numpy way without groupby, which is likely to be very fast. It returns the length of the longest run of strictly positive values (0 is excluded):

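import numpy as np

def countpos(x):
    # pad with -1 on both ends, locate the non-positive values, take the widest gap between them
    return np.diff(np.where(np.hstack((-1, x, -1)) <= 0)[0]).max() - 1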

df.loc['Count'] = df.apply(countpos)

Result:

>>> df
         X    y
a      1.0 -1.0
b     -2.0  2.0
c      3.0 -3.0
d      2.1  4.0
Count  2.0  1.0

Explanation

np.where() returns the indices of all non-positive values. For example:

>>> np.where(np.array([0,0,1,1,0,0,1]) <= 0)
(array([0, 1, 4, 5]),)

We bracket the actual values with -1 on both sides to force np.where() to report those boundary indices too. Then take the diff of the indices, take the max, subtract 1, et voilà: the maximum length of runs of strictly positive numbers.
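
The steps above can be spelled out on column X from the question. This is only an illustrative breakdown; the intermediate names (padded, stops, gaps, longest) are mine.

```python
import numpy as np

# Column X from the question.
x = np.array([1.0, -2.0, 3.0, 2.1])

# Sentinels on both ends so that runs touching the edges are bounded too.
padded = np.hstack((-1, x, -1))    # [-1, 1, -2, 3, 2.1, -1]

# Positions of every non-positive value, sentinels included.
stops = np.where(padded <= 0)[0]   # [0, 2, 5]

# Distance between neighbouring stops; a gap of g encloses g - 1 positive values.
gaps = np.diff(stops)              # [2, 3]

longest = gaps.max() - 1           # 2
```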

Speed

df = pd.DataFrame(np.random.uniform(-1, 1, size=(10000,100)))

a = %timeit -o df.apply(countpos)
14.3 ms ± 11.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
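
%timeit is an IPython magic, so it will not run in a plain Python script. As a rough sketch, the standard-library timeit module can do a similar measurement. The seed and the number of repetitions below are arbitrary choices of mine, and the timing will differ from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd

def countpos(x):
    # Same function as in the answer: longest run of strictly positive values.
    return np.diff(np.where(np.hstack((-1, x, -1)) <= 0)[0]).max() - 1

# Same shape as the benchmark above; the seed is only there for repeatability.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(-1, 1, size=(10000, 100)))

result = df.apply(countpos)

# Average wall-clock time over a handful of calls.
repeats = 10
seconds = timeit.timeit(lambda: df.apply(countpos), number=repeats) / repeats
print(f"{seconds * 1e3:.1f} ms per call")
```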

Answer 3

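Here is another way, as a one-liner. It labels each run of same-signed values with cumsum and takes the size of the largest run per column:

df.loc['count'] = df.lt(0).diff().ne(0).cumsum().stack().groupby(level=1).value_counts().groupby(level=0).max()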
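
One thing to watch: since the one-liner groups on sign changes, it appears to measure the longest run of values with the same sign, so a long run of negatives would also count. Below is a sketch of a per-column variant that only counts strictly positive runs. The function name longest_positive_run and the series neg_heavy are my own, for illustration only.

```python
import pandas as pd

def longest_positive_run(col: pd.Series) -> int:
    """Length of the longest run of strictly positive values (hypothetical helper)."""
    pos = col.gt(0)
    # A new label starts whenever the positive/non-positive flag flips.
    run_id = pos.ne(pos.shift()).cumsum()
    # Summing the flags per run gives the run length for positive runs and 0 otherwise.
    return int(pos.groupby(run_id).sum().max())

# The frame from the question.
df = pd.DataFrame(
    {"X": [1.0, -2.0, 3.0, 2.1], "y": [-1.0, 2.0, -3.0, 4.0]},
    index=list("abcd"),
)

counts = df.apply(longest_positive_run)

# A made-up column dominated by negatives: only one positive value in a row.
neg_heavy = pd.Series([-1.0, -2.0, -3.0, 1.0])
neg_heavy_count = longest_positive_run(neg_heavy)

# Append the counts as the last row, as the question asks.
df.loc["Count"] = counts
```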

3 answers

solution1
0 ACCEPTED 2021-03-05 02:51:16

solution2
0 2021-03-05 03:52:23

solution3
0 2021-03-05 04:59:42