I have a data set with a CATEGORY column and two date columns. I want to compare the two date columns within the same category.
I want to see whether DATE1 is less than the values in DATE2 of the same CATEGORY, and find the earliest date that it is greater than.
I'm trying this, but I'm not getting the results I'm looking for:
df['test'] = np.where(df['DATE1'] < df['DATE2'], 'Y', 'N')
   CATEGORY       DATE1       DATE2 GREATERTHAN GREATERDATE
0        23  2015-01-18  2015-01-15           Y  2015-01-10
1        11  2015-02-18  2015-02-19           N           0
2        23  2015-03-18  2015-01-10           Y  2015-01-10
3        11  2015-04-18  2015-08-18           Y  2015-02-19
4        23  2015-05-18  2015-02-21           Y  2015-01-10
5        11  2015-06-18  2015-08-18           Y  2015-02-19
6        15  2015-07-18  2015-02-18           0           0
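import numpy as np
import pandas as pd
df['DATE1'] = pd.to_datetime(df['DATE1'])
df['DATE2'] = pd.to_datetime(df['DATE2'])
## Row-wise flag: 'Y' where DATE1 is later than DATE2 on the same row
df['GREATERTHAN'] = np.where(df['DATE1'] > df['DATE2'], 'Y', 'N')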
## Getting the earliest date for which data is available, per category
earliest_dates = df.groupby('CATEGORY')[['DATE1', 'DATE2']].apply(lambda x: pd.concat([x['DATE1'], x['DATE2']]).min()).to_frame('EARLIEST_DATE')
## Merging to get the earliest date column per category
df = df.merge(earliest_dates, left_on='CATEGORY', right_index=True, how='left')
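The code above compares DATE1 with DATE2 on the same row and takes the minimum over both date columns. If the goal is instead the one described in the question (flag DATE1 when it is later than any DATE2 in its CATEGORY, and report the earliest such DATE2), a minimal sketch could look like the following. It relies on the observation that DATE1 is later than at least one DATE2 in a category exactly when it is later than the earliest DATE2 in that category. The names `earliest_date2` and `is_greater` are illustrative, and unmatched rows get `NaT` rather than `0`. Note that this is an assumption about the intended logic: it matches the sample rows for categories 23 and 11, but it marks the single-row category 15 as 'Y', whereas the sample table shows `0` there.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (CATEGORY, DATE1, DATE2 only)
df = pd.DataFrame({
    'CATEGORY': [23, 11, 23, 11, 23, 11, 15],
    'DATE1': ['2015-01-18', '2015-02-18', '2015-03-18', '2015-04-18',
              '2015-05-18', '2015-06-18', '2015-07-18'],
    'DATE2': ['2015-01-15', '2015-02-19', '2015-01-10', '2015-08-18',
              '2015-02-21', '2015-08-18', '2015-02-18'],
})
df['DATE1'] = pd.to_datetime(df['DATE1'])
df['DATE2'] = pd.to_datetime(df['DATE2'])

# Earliest DATE2 within each CATEGORY, broadcast back to every row
earliest_date2 = df.groupby('CATEGORY')['DATE2'].transform('min')

# DATE1 exceeds at least one DATE2 in its category
# exactly when it exceeds the earliest one
is_greater = df['DATE1'] > earliest_date2
df['GREATERTHAN'] = np.where(is_greater, 'Y', 'N')

# Keep the earliest DATE2 where the comparison holds, NaT elsewhere
df['GREATERDATE'] = earliest_date2.where(is_greater)

print(df)
```

Using `transform('min')` keeps the result aligned with the original rows, so no merge is needed.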