Python pandas: counts by hour

Question

I'm working with the following dataset of hourly counts (df). The dataframe has 8784 rows (one per hour for the year 2016).

I'd like to see if there are daily trends (e.g. whether there is an increase in the morning hours). For this I'd like to create a plot that has the hour of the day (from 0 to 24) on the x-axis and the number of cyclists on the y-axis (something like the picture below, from http://ofdataandscience.blogspot.co.uk/2013/03/capital-bikeshare-time-series-clustering.html ).

I experimented with different ways of using pivot, resample and set_index and plotting the result with matplotlib, without success. In other words, I couldn't find a way to sum up every observation at a certain hour and then plot those sums for each weekday.

Any ideas how to do this? Thanks in advance!

Answer 1

I think you can use groupby by hour and weekday and aggregate with sum (or maybe mean), then reshape with unstack and plot with DataFrame.plot:

# Do not assign the result back to df: plot() returns a matplotlib Axes, not a DataFrame
df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack().plot()

Solution with pivot_table:

# plot() returns a matplotlib Axes, so the result is named ax rather than df1
ax = df.pivot_table(index=df['Date'].dt.hour,
                    columns='weekday',
                    values='Cyclists',
                    aggfunc='sum').plot()
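The two approaches above should produce the same table, and swapping sum for mean only changes the aggregation. The following is a minimal sketch of that check, not part of the original answer: the sample data is made up (two weeks of random hourly counts), the hour Series is renamed to `hour` purely for readability, and the variable names `by_groupby` and `by_pivot` are illustrative.

```python
import numpy as np
import pandas as pd

# Made-up sample data: two weeks of hourly counts (column names follow the answer).
np.random.seed(100)
rng = pd.date_range("2016-01-04", periods=24 * 14, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"Date": rng, "Cyclists": np.random.randint(100, size=len(rng))})
df["weekday"] = df["Date"].dt.day_name()

# Hour of day as a grouping key; the rename only gives the index a clearer label.
hour = df["Date"].dt.hour.rename("hour")

# Approach 1: groupby on (hour, weekday), aggregate, then move weekday into columns.
by_groupby = df.groupby([hour, "weekday"])["Cyclists"].mean().unstack()

# Approach 2: pivot_table does the grouping and reshaping in one call.
by_pivot = df.pivot_table(index=hour, columns="weekday",
                          values="Cyclists", aggfunc="mean")

# Both tables have one row per hour and one column per weekday.
print(by_groupby.shape)
print(by_groupby.round(1).head())
```

Using mean instead of sum is probably the safer choice when some weekdays occur more often than others in the data, since sums would then not be comparable across columns.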

Sample:

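import numpy as np
import pandas as pd
N = 200
np.random.seed(100)
# 'H' is deprecated in favor of 'h' since pandas 2.2
rng = pd.date_range('2016-01-01', periods=N, freq='H')
df = pd.DataFrame({'Date': rng, 'Cyclists': np.random.randint(100, size=N)})
# weekday_name was removed in pandas 1.0; day_name() replaces it
df['weekday'] = df['Date'].dt.day_name()
print(df.head())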
   Cyclists                Date weekday
0         8 2016-01-01 00:00:00  Friday
1        24 2016-01-01 01:00:00  Friday
2        67 2016-01-01 02:00:00  Friday
3        87 2016-01-01 03:00:00  Friday
4        79 2016-01-01 04:00:00  Friday

print (df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack())
weekday  Friday  Monday  Saturday  Sunday  Thursday  Tuesday  Wednesday
Date                                                                   
0           102      91       120      53        95       86         21
1           102      83       100      27        20       94         25
2           121      53       105      56        10       98         54
3           164      78        54      30         8       42          6
4           163       0        43      48        89       84         37
5            49      13       150      47        72       95         58
6            24      57        32      39        30       76         39
7           127      76       128      38        12       33         94
8            72       3        59      44        18       58         51
9           138      70        67      18        93       42         30
10           77       3         7      64        92       22         66
11          159      84        49      56        44        0         24
12          156      79        47      34        57       55         55
13           42      10        65      53         0       98         17
14          116      87        61      74        73       19         45
15          106      60        14      17        54       53         89
16           22       3        55      72        92       68         45
17          154      48        71      13        66       62         35
18           60      52        80      30        16       50         16
19           79      43         2      17         5       68         12
20           11      36        94      53        51       35         86
21          180       5        19      68        90       23         82
22          103      71        98      50        34        9         67
23           92      38        63      91        67       48         92

df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack().plot()
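To get closer to the plot described in the question (hour of day on the x-axis, number of cyclists on the y-axis), the Axes returned by plot() can be labelled. This is only a sketch under assumptions: the data is randomly generated, the label texts and figure size are arbitrary, and the `Agg` backend is selected just so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import numpy as np
import pandas as pd

# Made-up sample data: two weeks of hourly counts.
np.random.seed(100)
rng = pd.date_range("2016-01-01", periods=24 * 14, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"Date": rng, "Cyclists": np.random.randint(100, size=len(rng))})
df["weekday"] = df["Date"].dt.day_name()

# One row per hour of day, one column per weekday.
table = df.groupby([df["Date"].dt.hour, "weekday"])["Cyclists"].sum().unstack()

# DataFrame.plot draws one line per column and returns the matplotlib Axes.
ax = table.plot(figsize=(10, 5))
ax.set_xlabel("Hour of day")
ax.set_ylabel("Number of cyclists")
ax.set_xticks(range(24))  # one tick per hour, 0 to 23
```

From here, `ax.figure.savefig(...)` with a file name of your choice writes the figure to disk, or `matplotlib.pyplot.show()` displays it when an interactive backend is in use.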

EDIT:

You can also convert weekday to categorical so that the columns are sorted in weekday order rather than alphabetically:

names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
# astype('category', categories=...) is no longer supported; pass a CategoricalDtype instead
df['weekday'] = df['weekday'].astype(pd.CategoricalDtype(categories=names, ordered=True))
df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack().plot()
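As an alternative to the categorical conversion, the columns of the reshaped table can be reordered directly with reindex. This is a sketch of that swapped-in technique, not what the answer itself does; the sample data is made up and `table` is an illustrative name.

```python
import numpy as np
import pandas as pd

names = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]

# Made-up sample data: two weeks of hourly counts.
np.random.seed(100)
rng = pd.date_range("2016-01-01", periods=24 * 14, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"Date": rng, "Cyclists": np.random.randint(100, size=len(rng))})
df["weekday"] = df["Date"].dt.day_name()

# unstack() yields alphabetically sorted weekday columns;
# reindex(columns=names) puts them in calendar order.
table = (df.groupby([df["Date"].dt.hour, "weekday"])["Cyclists"]
           .sum()
           .unstack()
           .reindex(columns=names))

print(list(table.columns))
```

Note that reindex inserts an all-NaN column for any weekday that is missing from the data, whereas the categorical approach keeps the ordering attached to the column itself, which also helps with later sorting.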

Answer 1: score 11, accepted, 2017-04-24 10:54:59
