I have a dataframe (df) as follows:
import pandas as pd
d = {'Item': ['x', 'y', 'z', 'x', 'z'], 'Count': ['10', '11', '12', '9', '10'], 'Date': pd.to_datetime(['2018-8-14', '2018-8-14', '2018-8-14', '2018-8-13', '2018-8-13'])}
df = pd.DataFrame(data=d)
Item Count Date
x 10 2018-08-14
y 11 2018-08-14
z 12 2018-08-14
x 9 2018-08-13
x 9 2018-08-12
z 10 2018-08-13
I want to compare rows based on the following: for each item, compare the count of max(Date) with the count of max(Date) - 1.
Meaning it should compare the count for item x for dates 2018-08-13 and 2018-08-14. If the count for max(Date) is greater, then it should select that row and store it in a different dataframe. Same for item z: it should compare the counts for dates 2018-08-13 and 2018-08-14, and because the count is greater it should select the row for item z with count 12.
Output: df2
Item Count Date
x 10 2018-08-14
z 12 2018-08-14
I've tried the following:
if ((df.Item == df.Item) and
    (df.Date > df.Date) and (df.Count > df.Count)):
    print("we met the conditions!")
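Using merge with key Item (this relies on Count being numeric; with the string values in the sample data, '10' > '9' is False, so convert it first with df['Count'].astype(int)):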
df.loc[df.reset_index().merge(df,on='Item').loc[lambda x : (x['Count_x']>x['Count_y'])&(x['Date_x']>x['Date_y'])]['index'].unique()]
Out[49]:
Item Count Date
0 x 10 2018-08-14
2 z 12 2018-08-14
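For readability, the one-liner above can be unpacked into named steps. This is only a sketch of the same self-merge idea, not a different method; the names pairs, mask and keep are illustrative, and the conversion of Count to integers is an assumption based on the sample data storing it as strings. Note that it keeps a row whenever any older row of the same item has a smaller count, not strictly the row from exactly one day earlier.

```python
import pandas as pd

d = {'Item': ['x', 'y', 'z', 'x', 'z'],
     'Count': ['10', '11', '12', '9', '10'],
     'Date': pd.to_datetime(['2018-8-14', '2018-8-14', '2018-8-14',
                             '2018-8-13', '2018-8-13'])}
df = pd.DataFrame(data=d)

# Assumption: counts should be compared as numbers, not as strings.
df['Count'] = df['Count'].astype(int)

# Keep the original row labels in an 'index' column, then pair every row
# with every row that has the same Item (including itself).
pairs = df.reset_index().merge(df, on='Item')

# A left-hand row qualifies if some same-Item row is both older and smaller.
mask = (pairs['Count_x'] > pairs['Count_y']) & (pairs['Date_x'] > pairs['Date_y'])
keep = pairs.loc[mask, 'index'].unique()

# Select the qualifying rows from the original dataframe.
df2 = df.loc[keep]
print(df2)
```

On the sample data this selects the rows labelled 0 and 2 (items x and z on 2018-08-14).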
Thanks to @Wen, I was able to break his solution down into a more basic, step-by-step version.
Create temporary datasets that hold the values for max(date) and max(date)-1:
t_day = df[df.Date == df.Date.max()]
y_day = df[df.Date == df.Date.max() - pd.to_timedelta(1, unit='d')]
Merge the temporary dataframes to create a master temp:
temp = t_day.merge(y_day, on = 'Item', how='outer')
temp = temp.dropna()
Define a function that checks the required condition:
def func(row):
    # Use `and` here: `&` binds tighter than `>` and would be evaluated first
    if int(row['Count_x']) > int(row['Count_y']) and row['Date_x'] > row['Date_y']:
        return '1'
    else:
        return '0'
temp['cond'] = temp.apply(func, axis=1)
Keep only the rows that meet the condition, then drop the unused columns:
temp = temp[temp['cond'] == '1'].drop(['Count_y', 'Date_y', 'cond'], axis=1)
print(temp)
Now it returns:
Count_x Date_x Item
10 2018-08-14 x
12 2018-08-14 z
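If "max(Date)" is meant per item rather than for the whole dataframe, one possible alternative is to sort by item and date and compare each row with the previous row of the same item. This is a different technique from the merge above (a groupby with shift), shown as a sketch under that assumption; the names df_sorted, prev_count, prev_date and is_latest are illustrative, and Count is again assumed to need conversion to integers.

```python
import pandas as pd

d = {'Item': ['x', 'y', 'z', 'x', 'z'],
     'Count': ['10', '11', '12', '9', '10'],
     'Date': pd.to_datetime(['2018-8-14', '2018-8-14', '2018-8-14',
                             '2018-8-13', '2018-8-13'])}
df = pd.DataFrame(data=d)
df['Count'] = df['Count'].astype(int)  # assumption: compare counts numerically

# Order rows so that each item's dates are ascending.
df_sorted = df.sort_values(['Item', 'Date'])
grp = df_sorted.groupby('Item')

# Previous row's values within the same item (NaN/NaT for the first row).
prev_count = grp['Count'].shift()
prev_date = grp['Date'].shift()

# True only for the row holding each item's most recent date.
is_latest = df_sorted['Date'] == grp['Date'].transform('max')

# Keep the latest row when the previous row is exactly one day older
# and has a smaller count.
one_day_apart = (df_sorted['Date'] - prev_date) == pd.Timedelta(days=1)
df2 = df_sorted[is_latest & one_day_apart & (df_sorted['Count'] > prev_count)]
print(df2)
```

Items with no row on the preceding day (such as y in the sample data) are left out, because the comparison against a missing previous row evaluates to False.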