
Grouping data series by day intervals with Pandas

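I have to perform some data analysis by season.

I have roughly a year and a half of hourly measurements, from the end of 2015 to the second half of 2017. What I want to do is sort this data by season.

Here is an example of the data I am working with: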

Date,Year,Month,Day,Day week,Hour,Holiday,Week Day,Impulse,Power (kW),Temperature (C)
04/12/2015,2015,12,4,6,18,0,6,2968,1781,16.2
04/12/2015,2015,12,4,6,19,0,6,2437,1462,16.2
19/04/2016,2016,4,19,3,3,0,3,1348,809,14.4
19/04/2016,2016,4,19,3,4,0,3,1353,812,14.1
11/06/2016,2016,6,11,7,19,0,7,1395,837,18.8
11/06/2016,2016,6,11,7,20,0,7,1370,822,17.4
11/06/2016,2016,6,11,7,21,0,7,1364,818,17
11/06/2016,2016,6,11,7,22,0,7,1433,860,17.5
04/12/2016,2016,12,4,1,17,0,1,1425,855,14.6
04/12/2016,2016,12,4,1,18,0,1,1466,880,14.4
07/03/2017,2017,3,7,3,14,0,3,3668,2201,14.2
07/03/2017,2017,3,7,3,15,0,3,3666,2200,14
24/04/2017,2017,4,24,2,5,0,2,1347,808,11.4
24/04/2017,2017,4,24,2,6,0,2,1816,1090,11.5
24/04/2017,2017,4,24,2,7,0,2,2918,1751,12.4
15/06/2017,2017,6,15,5,13,1,1,2590,1554,22.5
15/06/2017,2017,6,15,5,14,1,1,2629,1577,22.5
15/06/2017,2017,6,15,5,15,1,1,2656,1594,22.1
15/11/2017,2017,11,15,4,13,0,4,3765,2259,15.6
15/11/2017,2017,11,15,4,14,0,4,3873,2324,15.9
15/11/2017,2017,11,15,4,15,0,4,3905,2343,15.8
15/11/2017,2017,11,15,4,16,0,4,3861,2317,15.3

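As you can see, I have data from three different years.

My plan was to convert the first column with pd.to_datetime() and then group the rows by day/month, regardless of the year, in dd/mm intervals. For example, if winter runs from 21/12 to 21/03, create a new dataframe containing every row whose date falls in that interval, whatever the year. However, I could not find a way to do this while ignoring the year, which makes things more complicated.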
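One way to ignore the year after pd.to_datetime() is to reduce each date to a month * 100 + day key and compare that key against the interval bounds. The sketch below is only an illustration: the rows are made up in the same dd/mm/YYYY layout as the sample, and the names month_day and is_winter are not from the original post.

```python
import pandas as pd

# Made-up rows in the same dd/mm/YYYY layout as the sample data.
df = pd.DataFrame({
    "Date": ["04/12/2015", "25/12/2016", "15/01/2017", "07/03/2017", "15/06/2017"],
    "Power (kW)": [1781, 855, 900, 2201, 1554],
})

# Parse day-first dates explicitly so 04/12 is read as 4 December.
dates = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Year-independent key: month * 100 + day, e.g. 21 December -> 1221.
month_day = dates.dt.month * 100 + dates.dt.day

# Winter (21/12 to 21/03) wraps around the new year,
# so it is the union of two ranges rather than a single one.
is_winter = (month_day >= 1221) | (month_day <= 321)
df_winter = df[is_winter]
print(df_winter)
```

Because the key drops the year entirely, rows from December 2016 and from early 2017 end up in the same frame.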

EDIT: The desired output would be:

df_spring
Date,Year,Month,Day,Day week,Hour,Holiday,Week Day,Impulse,Power (kW),Temperature (C)
19/04/2016,2016,4,19,3,3,0,3,1348,809,14.4
19/04/2016,2016,4,19,3,4,0,3,1353,812,14.1
07/03/2017,2017,3,7,3,14,0,3,3668,2201,14.2
07/03/2017,2017,3,7,3,15,0,3,3666,2200,14
24/04/2017,2017,4,24,2,5,0,2,1347,808,11.4
24/04/2017,2017,4,24,2,6,0,2,1816,1090,11.5
24/04/2017,2017,4,24,2,7,0,2,2918,1751,12.4

df_autumn
Date,Year,Month,Day,Day week,Hour,Holiday,Week Day,Impulse,Power (kW),Temperature (C)
04/12/2015,2015,12,4,6,18,0,6,2968,1781,16.2
04/12/2015,2015,12,4,6,19,0,6,2437,1462,16.2
04/12/2016,2016,12,4,1,17,0,1,1425,855,14.6
04/12/2016,2016,12,4,1,18,0,1,1466,880,14.4
15/11/2017,2017,11,15,4,13,0,4,3765,2259,15.6
15/11/2017,2017,11,15,4,14,0,4,3873,2324,15.9
15/11/2017,2017,11,15,4,15,0,4,3905,2343,15.8
15/11/2017,2017,11,15,4,16,0,4,3861,2317,15.3

And so on for the remaining seasons.

Define each season by filtering the relevant rows on the Day and Month columns, as shown here for winter:

df_winter = df.loc[((df['Day'] >= 21) & (df['Month'] == 12)) | (df['Month'] == 1) | (df['Month'] == 2) | ((df['Day'] <= 21) & (df['Month'] == 3))]
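The same Day/Month filter can be extended to label every row with a season in one pass. This is a minimal sketch, not the answerer's code: only the winter interval (21/12 to 21/03) comes from the question, while the 21/06 and 21/09 cut-offs, the Season column, and the by_season dict are assumptions to adjust.

```python
import numpy as np
import pandas as pd

# Illustrative rows only; the real frame already has Month and Day columns.
df = pd.DataFrame({
    "Month": [12, 12, 4, 6, 11, 3],
    "Day":   [4, 25, 19, 15, 15, 7],
})

# Year-independent key: month * 100 + day (21 March -> 321).
key = df["Month"] * 100 + df["Day"]

# Conditions are checked in order; the first match wins.
# Winter bounds are from the question; 21/06 and 21/09 are assumed.
conditions = [
    (key >= 1221) | (key <= 321),  # winter wraps around the new year
    key <= 621,                    # spring: 22/03 to 21/06
    key <= 921,                    # summer: 22/06 to 21/09
]
df["Season"] = np.select(conditions, ["winter", "spring", "summer"], default="autumn")

# One DataFrame per season that actually occurs, keyed by season name.
by_season = {name: group for name, group in df.groupby("Season")}
print(df)
```

Note that with these date boundaries a row dated 07/03 is labelled winter, whereas the desired output in the question lists it under spring, so the cut-offs would need changing to reproduce that grouping.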

You can simply filter the data by month with isin():

# spring
df[df['Month'].isin([3,4])]

    Date    Year    Month   Day Day week    Hour    Holiday Week Day    Impulse Power (kW)  Temperature (C)
2   19/04/2016  2016    4   19  3   3   0   3   1348    809 14.4
3   19/04/2016  2016    4   19  3   4   0   3   1353    812 14.1
10  07/03/2017  2017    3   7   3   14  0   3   3668    2201    14.2
11  07/03/2017  2017    3   7   3   15  0   3   3666    2200    14.0
12  24/04/2017  2017    4   24  2   5   0   2   1347    808 11.4
13  24/04/2017  2017    4   24  2   6   0   2   1816    1090    11.5
14  24/04/2017  2017    4   24  2   7   0   2   2918    1751    12.4


# autumn

df[df['Month'].isin([11,12])]

Date    Year    Month   Day Day week    Hour    Holiday Week Day    Impulse Power (kW)  Temperature (C)
0   04/12/2015  2015    12  4   6   18  0   6   2968    1781    16.2
1   04/12/2015  2015    12  4   6   19  0   6   2437    1462    16.2
8   04/12/2016  2016    12  4   1   17  0   1   1425    855 14.6
9   04/12/2016  2016    12  4   1   18  0   1   1466    880 14.4
18  15/11/2017  2017    11  15  4   13  0   4   3765    2259    15.6
19  15/11/2017  2017    11  15  4   14  0   4   3873    2324    15.9
20  15/11/2017  2017    11  15  4   15  0   4   3905    2343    15.8
21  15/11/2017  2017    11  15  4   16  0   4   3861    2317    15.3
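If a separate frame per season is wanted without repeating the filter, the month lists can be kept in a dict and applied with map() and groupby(). This is a sketch under assumptions: season_months contains only the two month lists used in the filters above, and the names season_months, month_to_season, and frames are illustrative.

```python
import pandas as pd

# A few rows taken from the sample data, reduced to two columns.
df = pd.DataFrame({
    "Date": ["04/12/2015", "19/04/2016", "11/06/2016", "07/03/2017", "15/11/2017"],
    "Month": [12, 4, 6, 3, 11],
})

# Months per season as used in the two filters above;
# the remaining seasons would be added the same way.
season_months = {"spring": [3, 4], "autumn": [11, 12]}

# Invert to a month -> season lookup.
month_to_season = {m: s for s, months in season_months.items() for m in months}

# Months without an entry become NaN and are skipped by groupby.
df["Season"] = df["Month"].map(month_to_season)
frames = {
    season: group.drop(columns="Season")
    for season, group in df.groupby("Season")
}

df_spring = frames["spring"]
df_autumn = frames["autumn"]
print(df_spring)
print(df_autumn)
```

Rows whose month is not listed (June here) are left out of every frame until their season is added to the dict.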

