Compare pandas DataFrame rows based on a condition

Question

I have a DataFrame (df) as follows:

import pandas as pd

d = {'Item': ['x', 'y', 'z', 'x', 'z'], 'Count': ['10', '11', '12', '9', '10'], 'Date': pd.to_datetime(['2018-8-14', '2018-8-14', '2018-8-14', '2018-8-13', '2018-8-13'])}

df = pd.DataFrame(data=d)


Item       Count        Date
x          10           2018-08-14
y          11           2018-08-14
z          12           2018-08-14
x          9            2018-08-13
x          9            2018-08-12
z          10           2018-08-13

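I want to compare rows as follows: for each Item, compare the Count on max(Date) with the Count on max(Date) - 1 day.

For item x, that means comparing the counts for 2018-08-13 and 2018-08-14. If the count on max(Date) is larger, that row should be selected and stored in another DataFrame.

The same applies to item z: compare the counts for 2018-08-13 and 2018-08-14, and because the count on 2018-08-14 is larger, select the row with Count 12 for item z.

Expected output (df2):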

Item     Count     Date
x        10        2018-08-14
z        12        2018-08-14

I tried the following:

if ((df.Item == df.Item) and
        (df.Date > df.Date) and (df.Count > df.Count)):
    print("we met the conditions!")

Answer 1

Use merge with Item as the key:

df.loc[df.reset_index().merge(df,on='Item').loc[lambda x : (x['Count_x']>x['Count_y'])&(x['Date_x']>x['Date_y'])]['index'].unique()]
Out[49]: 
  Item  Count       Date
0    x     10 2018-08-14
2    z     12 2018-08-14
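
The one-liner above can be hard to follow, so here is a step-by-step sketch of what it appears to do. It assumes Count holds integers (the question builds Count from strings, where '10' > '9' would be False), and the names `pairs` and `mask` are only illustrative. Note that this self-join compares each row against any older row of the same Item, not strictly the previous day.

```python
import pandas as pd

# Sample data from the question, with Count as integers (an assumption).
df = pd.DataFrame({
    'Item': ['x', 'y', 'z', 'x', 'z'],
    'Count': [10, 11, 12, 9, 10],
    'Date': pd.to_datetime(['2018-8-14', '2018-8-14', '2018-8-14',
                            '2018-8-13', '2018-8-13']),
})

# Keep the original row labels in an 'index' column, then pair every row
# with every row of the same Item (a self-join on Item).
pairs = df.reset_index().merge(df, on='Item')

# A left-hand row qualifies if some same-Item row is both older and smaller.
mask = (pairs['Count_x'] > pairs['Count_y']) & (pairs['Date_x'] > pairs['Date_y'])

# Map the qualifying pairs back to rows of the original frame.
df2 = df.loc[pairs.loc[mask, 'index'].unique()]
print(df2)
```

If Count is stored as strings, converting it first with `df['Count'] = df['Count'].astype(int)` should make the comparison numeric.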

Answer 2

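Thanks to @Wen, I was able to break his approach down into a more basic version.

Create temporary datasets holding the rows for max(Date) and max(Date) - 1 day: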

t_day = df[df.Date == df.Date.max()]
y_day = df[df.Date == df.Date.max() - pd.to_timedelta(1, unit='d')]

Merge the temporary DataFrames into one working frame:

temp = t_day.merge(y_day, on = 'Item', how='outer')
temp = temp.dropna()
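
As a side note, an outer merge followed by dropna should select the same items as an inner merge, provided the inputs contain no missing values of their own. The sketch below uses small hypothetical stand-ins for t_day and y_day to illustrate this; it is not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-ins for t_day and y_day (Date column omitted for brevity).
t_day = pd.DataFrame({'Item': ['x', 'y', 'z'], 'Count': [10, 11, 12]})
y_day = pd.DataFrame({'Item': ['x', 'z'], 'Count': [9, 10]})

# The outer merge keeps 'y' with NaN in Count_y; dropna then removes that row.
outer_then_drop = t_day.merge(y_day, on='Item', how='outer').dropna()

# The inner merge never creates the unmatched row in the first place.
inner = t_day.merge(y_day, on='Item', how='inner')

print(inner)
```

One practical difference: the NaN introduced by the outer merge turns an integer Count_y column into floats, while the inner merge keeps the integer dtype.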

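Define a function that encodes the required condition:

def func(row):
    # Flag rows where the newer count is larger. Use `and` for scalars:
    # `&` binds tighter than `>` and would change the meaning of the test.
    if int(row['Count_x']) > int(row['Count_y']) and row['Date_x'] > row['Date_y']:
        return '1'
    else:
        return '0'
temp['cond'] = temp.apply(func, axis=1)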
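
A note on combining comparisons inside a row-wise function like this: in Python, `&` binds more tightly than `>`, so a condition written as `a > b & (c > d)` is evaluated as `a > (b & (c > d))`. The minimal sketch below uses made-up values (`count_x`, `count_y`, `newer`) to show how the two forms can disagree.

```python
# Plain Python integers and booleans, no pandas needed.
count_x, count_y = 3, 4
newer = True  # stands in for row['Date_x'] > row['Date_y']

# Without parentheses this parses as count_x > (count_y & newer).
# 4 & True is 0, so the test becomes 3 > 0.
unparenthesized = count_x > count_y & newer

# `and` (or parentheses around each comparison) gives the intended test.
intended = (count_x > count_y) and newer

print(unparenthesized, intended)  # True False
```

For scalar values pulled from a row, `and` is usually the safer choice; `&` is meant for element-wise operations on whole Series, with each comparison wrapped in parentheses.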

Keep only the rows that meet the condition, then drop the unused columns:

temp = temp[temp['cond'] == '1'].drop(['Count_y', 'Date_y', 'cond'], axis=1)

print(temp)

This now returns:

Count_x      Date_x     Item   
10         2018-08-14    x     
12         2018-08-14    z
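
The approach above uses the maximum date of the whole frame, whereas the question asks for max(Date) per item. A possible alternative, sketched here with groupby and shift, handles each Item separately. It assumes integer counts and at most one row per Item and Date; the intermediate names (`s`, `g`, `is_latest`, and so on) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Item': ['x', 'y', 'z', 'x', 'z'],
    'Count': [10, 11, 12, 9, 10],  # integers here; an assumption
    'Date': pd.to_datetime(['2018-8-14', '2018-8-14', '2018-8-14',
                            '2018-8-13', '2018-8-13']),
})

# Order each Item's rows chronologically so shift() looks at the previous row.
s = df.sort_values(['Item', 'Date'])
g = s.groupby('Item')

prev_count = g['Count'].shift()  # Count on the previous row of the same Item
prev_date = g['Date'].shift()    # Date on the previous row of the same Item

is_latest = s['Date'] == g['Date'].transform('max')             # row is the Item's max(Date)
is_next_day = (s['Date'] - prev_date) == pd.Timedelta(days=1)   # previous row is max(Date) - 1
is_larger = s['Count'] > prev_count                             # count increased

df2 = s[is_latest & is_next_day & is_larger]
print(df2)
```

Item y has no row for the previous day, so its comparisons against the missing values evaluate to False and it is left out, matching the expected output in the question.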

Answer 1
Score 1, accepted, 2018-08-14 15:37:30

Answer 2
Score 0, 2018-08-14 18:18:37
