
How to create a pivot table on a pandas DATE column and calculate the time difference?

I have the following DataFrame:

D_DATE       BIN Number   Disposition    Unit Assigned        
2018-01-04    10005      SWO Issued      PLUMBING DIVISION     
2016-06-23    10005      SWO Issued      SCAFFOLD UNIT         
2016-06-23    10005      SWO Rescinded   SCAFFOLD UNIT         
2018-01-17    10005      SWO Rescinded   PLUMBING DIVISION  
2019-01-04    10006      SWO Rescinded   BEST SQUAD 
2018-12-21    10006      SWO Issued      BEST SQUAD 
2020-02-10    10006      SWO Issued      BEST SQUAD
2020-02-25    10006      SWO Rescinded   BEST SQUAD

import pandas as pd

df = pd.DataFrame({'D_DATE': ['2018-01-04', '2016-06-23', '2016-06-23', '2018-01-17', '2019-01-04', '2018-12-21', '2020-02-10', '2020-02-25'],
                   'BIN Number': ['10005', '10005', '10005', '10005', '10006', '10006', '10006', '10006'],
                   'Disposition': ['SWO Issued', 'SWO Issued', 'SWO Rescinded', 'SWO Rescinded', 'SWO Rescinded', 'SWO Issued', 'SWO Issued', 'SWO Rescinded'],
                   'Unit Assigned': ['PLUMBING DIVISION', 'SCAFFOLD UNIT', 'SCAFFOLD UNIT', 'PLUMBING DIVISION', 'BEST SQUAD', 'BEST SQUAD', 'BEST SQUAD', 'BEST SQUAD']})

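If possible, I would like to create a pivot table with two date columns, one for the issued date and one for the rescinded date. The pivot needs to keep the unit, so I should end up with three columns: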

Unit, Issued Date, Rescinded Date

Next, I would like to calculate the time difference between the issued date and the rescinded date.

Expected output:

Unit Assigned      SWO Issued     SWO Rescinded    Time Difference
PLUMBING DIVISION  2018-01-04     2018-01-17        13 days
SCAFFOLD UNIT      2016-06-23     2016-06-23        0 days
BEST SQUAD         2018-12-21     2019-01-04        14 days
BEST SQUAD         2020-02-10     2020-02-25        15 days

Any help is appreciated. Thank you.

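I believe this is a job for pivot/pivot_table. Within each (BIN Number, Disposition, Unit Assigned) group, cumcount numbers the rows 0, 1, 2, ..., so the n-th "SWO Issued" row is paired with the n-th "SWO Rescinded" row: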

# convert to datetime if the column is not already datetime
df['D_DATE'] = pd.to_datetime(df['D_DATE'])

# idx is the occurrence number within each group; it keeps repeated
# issue/rescind pairs for the same unit on separate rows of the pivot
(df.assign(idx=df.groupby(['BIN Number', 'Disposition', 'Unit Assigned']).cumcount())
   .pivot_table(index=['idx', 'BIN Number', 'Unit Assigned'],
                columns='Disposition',
                values='D_DATE',
                aggfunc='first')
   .reset_index()
   .assign(Time_Different=lambda x: x['SWO Rescinded'] - x['SWO Issued'])
   .drop('idx', axis=1)
)

Output:

Disposition     BIN Number  Unit Assigned       SWO Issued  SWO Rescinded   Time_Different
0               10005       PLUMBING DIVISION   2018-01-04  2018-01-17      13 days 
1               10005       SCAFFOLD UNIT       2016-06-23  2016-06-23      0 days
2               10006       BEST SQUAD          2018-12-21  2019-01-04      14 days
3               10006       BEST SQUAD          2020-02-10  2020-02-25      15 days
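
Note that the pairing above relies on row order: cumcount numbers rows in the order they appear, so the n-th issue and the n-th rescind of a unit only line up if the rows are already roughly chronological. A minimal, self-contained sketch that sorts by date first is shown below. It assumes every "SWO Issued" has exactly one later (or same-day) "SWO Rescinded" for the same BIN and unit; the names `ordered`, `keys`, and `result` are illustrative and not from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'D_DATE': ['2018-01-04', '2016-06-23', '2016-06-23', '2018-01-17',
               '2019-01-04', '2018-12-21', '2020-02-10', '2020-02-25'],
    'BIN Number': ['10005'] * 4 + ['10006'] * 4,
    'Disposition': ['SWO Issued', 'SWO Issued', 'SWO Rescinded', 'SWO Rescinded',
                    'SWO Rescinded', 'SWO Issued', 'SWO Issued', 'SWO Rescinded'],
    'Unit Assigned': ['PLUMBING DIVISION', 'SCAFFOLD UNIT', 'SCAFFOLD UNIT',
                      'PLUMBING DIVISION', 'BEST SQUAD', 'BEST SQUAD',
                      'BEST SQUAD', 'BEST SQUAD'],
})
df['D_DATE'] = pd.to_datetime(df['D_DATE'])

# Sort chronologically so the n-th "Issued" lines up with the n-th "Rescinded".
# kind='stable' keeps the original relative order of rows that share a date.
ordered = df.sort_values('D_DATE', kind='stable')

# Assumption: one rescind per issue for each (BIN Number, Unit Assigned) pair.
keys = ['BIN Number', 'Unit Assigned']
ordered = ordered.assign(
    idx=ordered.groupby(keys + ['Disposition']).cumcount()
)

result = (ordered
          .pivot_table(index=keys + ['idx'],
                       columns='Disposition',
                       values='D_DATE',
                       aggfunc='first')
          .reset_index()
          .drop(columns='idx'))
result.columns.name = None  # drop the leftover "Disposition" axis label

# Subtracting two datetime columns yields a timedelta column.
result['Time Difference'] = result['SWO Rescinded'] - result['SWO Issued']
print(result)
```

If an issue has no matching rescind yet, its "SWO Rescinded" cell comes out as NaT and the difference is NaT as well, so such rows are easy to spot afterwards.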

