
Plotting histogram in Pandas/matplotlib for each hour of the day

I have a time series dataset of hourly granularity showing the price return of an asset for each hour of the day over one year.

I am trying to plot the distribution of returns for each hour of the day. I thought there might be a way to group the returns by hour and then plot a histogram for each group, so that the output of the loop/function/method (i.e. the goal I am aiming for) would be 24 distribution plots, one per hour of the day, each covering the entire time period.

My current dataframe is multi-indexed as Day, Hour (this may not be the right structure for my goal, and I can change it if required).

I am able to use groupby to get the hourly average return over the timeframe (df.groupby("Hour").mean()) and thought I could use a similar method to plot my distributions.

Any suggestions on how to accomplish this would be appreciated.

Example with very simple data (just 9-10 AM and 10-11 AM for 3 days):

import pandas as pd
import matplotlib.pyplot as plt

# Toy data: Price for hours 9 and 10 on days 1-3, indexed by (Day, Hour)
arrays = [[1, 1, 2, 2, 3, 3], [9, 10, 9, 10, 9, 10]]
ind = pd.MultiIndex.from_arrays(arrays, names=['Day', 'Hour'])
df = pd.DataFrame({'Price': [34, 35, 37, 31, 30, 29]}, index=ind)

# groupby yields (hour, sub-DataFrame) pairs; draw one histogram per hour
for hour, group in df.groupby("Hour"):
    group.plot(kind='hist', bins=10, title=f'{hour}:00 - {hour+1}:00')
    plt.show()
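The loop above opens a separate figure for each hour. With all 24 hours present, it may be easier to compare the distributions on a single grid of axes. The following is only a sketch under stated assumptions: the data is synthetic (random returns standing in for the real series), and the column name "Return", the helper plot_hourly_histograms, and the 4x6 layout are illustrative choices that do not come from the original post.

```python
import math

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (assumption): 365 days x 24 hours of
# normally distributed returns, indexed by (Day, Hour) as in the question.
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product(
    [range(1, 366), range(24)], names=["Day", "Hour"]
)
returns = pd.DataFrame({"Return": rng.normal(0.0, 0.01, len(index))}, index=index)


def plot_hourly_histograms(df, column, bins=30, ncols=6):
    """Draw one histogram per hour on a shared grid of axes (hypothetical helper)."""
    groups = df.groupby(level="Hour")[column]
    nrows = math.ceil(groups.ngroups / ncols)
    fig, axes = plt.subplots(
        nrows, ncols, figsize=(3 * ncols, 2.5 * nrows),
        sharex=True, sharey=True, squeeze=False,
    )
    # Use the same bin edges in every panel so the shapes are comparable.
    edges = np.histogram_bin_edges(df[column].to_numpy(), bins=bins)
    flat = axes.ravel()
    for ax, (hour, values) in zip(flat, groups):
        ax.hist(values, bins=edges)
        ax.set_title(f"{hour}:00 - {hour + 1}:00")
    # Hide leftover panels when the number of hours does not fill the grid.
    for ax in flat[groups.ngroups:]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig, axes


fig, axes = plot_hourly_histograms(returns, "Return")
# Call plt.show() or fig.savefig(...) afterwards to view the result.
```

Sharing the bin edges and the axes is a design choice rather than a requirement; dropping sharex/sharey lets each hour scale independently, which can help if some hours are far more volatile than others.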



Yes, groupby already does precisely what you want: it splits a DataFrame into groups. You can then iterate over those groups and plot each one individually. (In fact, to me, calling .mean() on the groups is the less intuitive thing to do.) Printing the DataFrame and then each groupby group gives this output:

          Price
Day Hour       
1   9        34
    10       35
2   9        37
    10       31
3   9        30
    10       29

(9,           Price
Day Hour       
1   9        34
2   9        37
3   9        30)
(10,           Price
Day Hour       
1   10       35
2   10       31
3   10       29)
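The question mentions that the (Day, Hour) index could be changed if needed. If the data starts out with a plain DatetimeIndex, the same groups can be obtained without building a MultiIndex first. This is a minimal sketch with made-up prices (three days of consecutive numbers); the variable names ts, by_hour and multi are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly data (assumption): 72 consecutive hours of made-up prices.
idx = pd.date_range("2021-01-01", periods=72, freq=pd.Timedelta(hours=1))
ts = pd.DataFrame({"Price": np.arange(72, dtype=float)}, index=idx)

# Option 1: group directly on the hour of each timestamp.
by_hour = ts.groupby(ts.index.hour)["Price"]
print(by_hour.mean().head())

# Option 2: rebuild a (Day, Hour) MultiIndex like the one in the question,
# then group on the "Hour" level exactly as in the answer.
multi = ts.copy()
multi.index = pd.MultiIndex.from_arrays(
    [ts.index.dayofyear, ts.index.hour], names=["Day", "Hour"]
)
for hour, group in multi.groupby(level="Hour"):
    # Each group holds one row per day for this hour.
    print(hour, len(group))
```

Either form yields one group per hour of the day, so the plotting loop from the answer applies unchanged.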


 