
Is there a way to store previous value in a row and change when a new condition is met?

I have a dataset with a date-time column and a value column for each ID. I run some calculations on it, but I am stuck on the recursive ones, where a row's value depends on the rows before it.

This is how the dataset looks:

Date-Time     Volume      ID    Load  
10/22/2019     3862       10        
10/23/2019     3800       10        
10/24/2019     3700       10        
10/25/2019     5000       10     Yes   
10/26/2019     4900       10        
10/27/2019     4800       10        
10/22/2019     3862       11        
10/23/2019     3800       11        
10/24/2019     3700       11        
10/25/2019     5000       11     Yes        
10/26/2019     4900       11        
10/27/2019     4800       11           

I loop over the IDs in a separate function and call the calculation for each one.

This is what I tried:

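import pandas as pd

curr_load = 0

def Load_number(row):
    # Increment the running counter each time a row is flagged as a load.
    global curr_load
    if row['Load'] == 'Yes':
        curr_load = curr_load + 1
    return curr_load

def calculations(group):
    global curr_load
    curr_load = 0  # restart the count for each ID
    group = group.copy()
    group['Load_number'] = group.apply(Load_number, axis=1)
    return group

ids = data['ID'].unique()
newdata = pd.DataFrame()
for id in ids:
    subset = data.loc[data['ID'] == id]
    newdata = pd.concat([newdata, calculations(subset)])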

The required output is:

Date-Time     Volume      ID    Load    Load_number
10/22/2019     3862       100                0
10/23/2019     3800       100                0
10/24/2019     3700       100                0
10/25/2019     5000       100     Yes        1
10/26/2019     4900       100                1
10/27/2019     4800       100                1
10/28/2019     4700       100                1
10/22/2019     3862       111                0
10/23/2019     3800       111                0
10/24/2019     3700       111                0
10/25/2019     5000       111     Yes        1
10/26/2019     4900       111                1
10/27/2019     5800       111     Yes        2   
10/28/2019     5500       111                2     
10/29/2019     50000      111                2     

And the load date as:

Date-Time  Volume  ID  Load        LoadDate
10/22/2019    3862  10  None          0
10/23/2019    3800  10  None          0
10/24/2019    3700  10  None          0
10/25/2019    5000  10   Yes        10/25/2019
10/26/2019    4900  10  None        10/25/2019
10/27/2019    4800  10  None        10/25/2019
10/22/2019    3862  11  None           0
10/23/2019    3800  11  None           0
10/24/2019    3700  11  None           0
10/25/2019    5000  11   Yes        10/25/2019
10/26/2019    4900  11  None        10/25/2019
10/27/2019    4800  11  None        10/25/2019

That should do it:

import numpy as np

# Flag the load rows with 1, then take a running total within each ID.
df['Load Number'] = np.where(df.Load == 'Yes', 1, 0)
df['Load Number'] = df.groupby('ID')['Load Number'].cumsum()
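
For anyone who wants to try the idea in isolation, here is a minimal self-contained sketch. The small frame below is made-up sample data (not the asker's dataset), chosen so that each ID has more than one load. It builds the flag with `Series.eq` instead of `np.where`, which is just a stylistic swap:

```python
import pandas as pd

# Hypothetical sample data: two IDs, each with rows flagged "Yes" in Load.
df = pd.DataFrame({
    "ID": [10, 10, 10, 10, 11, 11, 11, 11],
    "Load": [None, "Yes", None, "Yes", None, None, "Yes", None],
})

# A boolean flag marks the rows where a new load starts.
is_load = df["Load"].eq("Yes")

# Summing the flag cumulatively within each ID counts the loads seen so far,
# and the count restarts at 0 for every new ID.
df["Load Number"] = is_load.astype(int).groupby(df["ID"]).cumsum()

load_numbers = df["Load Number"].tolist()
print(df)
```

Because the cumulative sum is taken per group, there is no need for a global counter or an explicit loop over the IDs.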

(Edit) Regarding your second question, you could use a similar approach:

# Keep the date on load rows only, then carry it forward within each ID.
df['LoadDate'] = np.where(df.Load == 'Yes', df['Date-Time'], np.nan)
df['LoadDate'] = df.groupby('ID')['LoadDate'].ffill().fillna(0)
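
A possible variant, sketched on made-up sample data: if `Date-Time` is parsed to real datetimes first, the rows before the first load can stay as `NaT` instead of `0`, which keeps the column a single dtype. This assumes the dates are in `MM/DD/YYYY` format and that a missing value is acceptable in place of the `0` shown in the question:

```python
import pandas as pd

# Hypothetical sample data, not the asker's full dataset.
df = pd.DataFrame({
    "Date-Time": ["10/24/2019", "10/25/2019", "10/26/2019",
                  "10/24/2019", "10/25/2019", "10/26/2019"],
    "ID": [10, 10, 10, 11, 11, 11],
    "Load": [None, "Yes", None, None, None, "Yes"],
})
df["Date-Time"] = pd.to_datetime(df["Date-Time"], format="%m/%d/%Y")

# Keep the date only on load rows; every other row becomes NaT.
load_date = df["Date-Time"].where(df["Load"].eq("Yes"))

# Forward-fill within each ID so later rows inherit the latest load date.
# Rows before the first load of an ID stay NaT.
df["LoadDate"] = load_date.groupby(df["ID"]).ffill()

print(df)
```

Filling within each group matters here: a plain `ffill()` over the whole column would leak the last load date of one ID into the first rows of the next.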

Output:

     Date-Time  Volume  ID  Load  Load Number    LoadDate
0   10/22/2019    3862  10  None            0           0
1   10/23/2019    3800  10  None            0           0
2   10/24/2019    3700  10  None            0           0
3   10/25/2019    5000  10   Yes            1  10/25/2019
4   10/26/2019    4900  10  None            1  10/25/2019
5   10/27/2019    4800  10  None            1  10/25/2019
6   10/22/2019    3862  11  None            0           0
7   10/23/2019    3800  11  None            0           0
8   10/24/2019    3700  11  None            0           0
9   10/25/2019    5000  11   Yes            1  10/25/2019
10  10/26/2019    4900  11  None            1  10/25/2019
11  10/27/2019    4800  11  None            1  10/25/2019
