How to use trailing rows on a column for calculations on that same column | Pandas Python

Question

I'm trying to figure out how to compare the element of the previous row of a column to a different column on the current row in a Pandas DataFrame. For example:

data = pd.DataFrame({'a':['1','1','1','1','1'],'b':['0','0','1','0','0']})

Output:

   a  b
0  1  0
1  1  0
2  1  1
3  1  0
4  1  0

And now I want to make a new column that asks if (data['a'] + data['b']) is greater than the previous value of that same column. Theoretically:

data['c'] = np.where(data['a']==( the previous row value of data['a'] ),min((data['b']+( the previous row value of data['c'] )),1),data['b'])

So that I can theoretically output:

   a   b   c
0  1   0   0
1  1   0   0
2  1   1   1
3  1   0   1
4  1   0   1

I'm wondering how to do this because I'm trying to recreate this Excel conditional statement: =IF(A70=A69,MIN((P70+Q69),1),P70)

where data['a'] = column A and data['b'] = column P.

If anyone has any ideas on how to do this, I'd greatly appreciate your advice.

Answer 1

According to your statement, 'new column that asks if (data['a'] + data['b']) is greater than the previous value of that same column', I suggest solving it this way:

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame({'a':[1,1,1,1,1],'b':[0,0,1,0,3]})  # integers, so + adds numbers instead of concatenating strings
>>> df
   a  b
0  1  0
1  1  0
2  1  1
3  1  0
4  1  3
>>> df['c'] = np.where(df['a']+df['b'] > df['a'].shift(1)+df['b'].shift(1), 1, 0)
>>> df
   a  b  c
0  1  0  0
1  1  0  0
2  1  1  1
3  1  0  0
4  1  3  1

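But this does not look at the 'previous value of that same column'; it compares against the previous row's a + b instead. If you try to use df['c'].shift(1) inside np.where(), it will raise KeyError: 'c', because column c does not exist yet when the expression is evaluated.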
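
Since each value of c may depend on the value of c in the row above, one way around the KeyError is to build the column row by row. The following is only a minimal sketch of the Excel formula from the question, not a definitive solution. It assumes that the first row has no predecessor and therefore just takes b, and that the string digits in b should be treated as integers. The names a, b, c, run_id and c_vec are illustrative.

```python
import pandas as pd

data = pd.DataFrame({'a': ['1', '1', '1', '1', '1'],
                     'b': ['0', '0', '1', '0', '0']})

a = data['a'].tolist()
b = data['b'].astype(int).tolist()  # the question stores the digits as strings

# Explicit loop: c[i] may depend on c[i - 1], mirroring
# =IF(A70=A69, MIN((P70+Q69), 1), P70)
c = []
for i in range(len(data)):
    if i > 0 and a[i] == a[i - 1]:
        c.append(min(b[i] + c[i - 1], 1))
    else:
        # Assumption: the first row, or a row where 'a' changes, just takes b
        c.append(b[i])
data['c'] = c

# Vectorized alternative, valid only under the assumption that b is 0 or 1:
# within each run of equal consecutive 'a' values, c is the running maximum of b.
run_id = (data['a'] != data['a'].shift()).cumsum()
c_vec = data['b'].astype(int).groupby(run_id).cummax()
```

For the sample data both approaches give 0, 0, 1, 1, 1, which matches the desired column c in the question. The loop is slower on large frames but follows the formula for any values of b, while the cummax variant is only equivalent when b contains nothing but 0 and 1.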

solution1
1 ACCEPTED 2016-01-16 21:59:55