Input Data:
Period Invoice_No Amount Date
Mar-19 123 5000.00 12-02-2019
Mar-19 124 5000.00 13-02-2019
Mar-19 125 7000.00 13-02-2019
Mar-19 126 5000.00 19-02-2019
Mar-19 127 5000.00 25-02-2019
Mar-19 128 5000.00 27-02-2019
So basically want to check if first row amount (5000.00) is equal to second row amount; then perform a date difference function (13-02-2019 "-" 12-02-2019) and if difference is less than "5 days" then the following is the output. If difference is more than 5 days exclude from the output (Line number 4)
Period Invoice_No Amount Date
Mar-19 123 5000.00 12-02-2019
Mar-19 124 5000.00 13-02-2019
Mar-19 127 5000.00 25-02-2019
Mar-19 128 5000.00 27-02-2019
You can shift the date to calculate the time delta between the rows. After I filter for the conditions you defined.
df['Date_diff'] = df['Date'].shift(-1)
df['Date_diff'] = df.apply(lambda x: pd.Timedelta(x['Date_diff'] - x['Date']),axis=1)
mask = (df['Amount'] == df['Amount'].shift(-1)) & (df['Date_diff'] < pd.Timedelta('5 days'))
mask2 = (df['Amount'] == df['Amount'].shift(1)) & (df['Date_diff'].shift(1) < pd.Timedelta('5 days'))
print(df[mask | mask2].drop('Date_diff', axis=1))
Period Invoice_No Amount Date
0 Feb-19 123 5000.0 2019-02-12
1 Feb-19 123 5000.0 2019-02-13
IICU:
df.sort_values(by=['Period','Date'], inplace=True)#Sortby Date
df['Date']=pd.to_datetime(df['Date'])
df['days']=df['Date'].diff().dt.days#Find days difference
df['Amountdiff']=df['Amount'].diff()#Find amount difference
df.fillna(0, inplace=True)#Fill any NaN/NaT
df2=df[df.Amountdiff==0]
df2.iloc[:,:4:]
Outcome
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.