Column formula dependent on previous row value and conditional on a separate column in Python

I am trying to create a new column in a pandas dataframe whose value is conditional on a different column and also depends on the previous row of the same column. The new column can be interpreted as an incremental time period that restarts with each new data field.

My desired output is: if the data field is equal to the previous data field, the new column value is the previous row's value + 1. If not, the new column is equal to 1.

In Excel, the formula looks like this:
=IF(A2=A1,C1+1,1)

Below is my data:

Data    Random_Columns
A   Random
A   Random
A   Random
A   Random
B   Random
B   Random
B   Random
B   Random
B   Random
B   Random
C   Random
C   Random
C   Random

Below is how I want the new column to look:

Data    Random_Columns  New_Column
A   Random  1
A   Random  2
A   Random  3
A   Random  4
B   Random  1
B   Random  2
B   Random  3
B   Random  4
B   Random  5
B   Random  6
C   Random  1
C   Random  2
C   Random  3

Every time the sorted dataframe reaches a new value in the Data column, the new column should restart its incremental counter from 1.

From other questions, I believe the "shift" function could be used here, but I have not been successful in getting the desired output.
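
For reference, the shift idea can be made to work by flagging the rows where Data differs from the row above, turning those flags into a run number, and counting rows within each run. This is only a minimal sketch: the frame below is a hypothetical reconstruction of the sample data, and the names is_new_run and run_id are made up for illustration.

```python
import pandas as pd

# Hypothetical sample frame mirroring the data in the question
df = pd.DataFrame({
    "Data": list("AAAABBBBBBCCC"),
    "Random_Columns": ["Random"] * 13,
})

# True wherever the value differs from the row above (the first row is always True)
is_new_run = df["Data"] != df["Data"].shift()

# A running total of those flags gives every consecutive run its own id
run_id = is_new_run.cumsum()

# Number the rows inside each run, starting from 1
df["New_Column"] = df.groupby(run_id).cumcount() + 1
```

Because the counter is tied to consecutive runs rather than to the Data value itself, it also restarts if the same value shows up again later in an unsorted frame, which is the literal behaviour of the Excel formula.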

Try this: create NewCol with a default value of 1, then apply DataFrame.groupby on Data and Series.cumsum within each group.

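# Helper column of ones, summed cumulatively within each 'Data' group.
# Selecting ['NewCol'] keeps other columns (e.g. Random_Columns) out of the cumsum.
df['NewCol'] = (
    df.assign(NewCol=1).groupby('Data')['NewCol'].cumsum()
)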

   Data  NewCol
0     A       1
1     A       2
2     A       3
3     A       4
4     B       1
5     B       2
6     B       3
7     B       4
8     B       5
9     B       6
10    C       1
11    C       2
12    C       3
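
A shorter variant of the same idea, sketched under the assumption that the frame is sorted by Data as in the question: GroupBy.cumcount numbers the rows of each group starting from 0, so adding 1 gives the counter without a helper column. The frame below is a hypothetical reconstruction of the sample data.

```python
import pandas as pd

# Hypothetical sample frame mirroring the data in the question
df = pd.DataFrame({
    "Data": list("AAAABBBBBBCCC"),
    "Random_Columns": ["Random"] * 13,
})

# cumcount() is 0-based within each group, so shift it to start at 1
df["New_Column"] = df.groupby("Data").cumcount() + 1
```

Note that grouping on Data counts all rows with the same value together, so this matches the Excel formula only when equal values are adjacent, as they are in a sorted frame.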
