Pandas - How to group-by and plot for each hour of each day of week

I need help figuring out how to plot subplots, for easy comparison, from the dataframe shown here:

  Date                   A        B         C              
2017-03-22 15:00:00     obj1    value_a    other_1
2017-03-22 14:00:00     obj2    value_ns   other_5
2017-03-21 15:00:00     obj3    value_kdsa other_23
2014-05-08 17:00:00     obj2    value_as   other_4
2010-07-01 20:00:00     obj1    value_as   other_0

I am trying to graph the number of occurrences of each hour for each respective day of the week. So count the number of occurrences for each combination of day of the week and hour, and plot the counts on subplots, one per day of the week.

If anything in this question is unclear, please let me know. Thanks.

You can accomplish this with two levels of groupby. Since we know there are 7 days in a week, we can specify that number of panels. If you groupby(df.Date.dt.dayofweek), the group key (0 for Monday through 6 for Sunday) can be used directly as the index into the array of subplot axes:

Sample Data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

n = 10000
np.random.seed(123)
df = pd.DataFrame({'Date': pd.date_range('2010-01-01', freq='1.09min', periods=n),
                   'A': np.random.randint(1,10,n),
                   'B': np.random.normal(0,1,n)})

Code:

fig, ax = plt.subplots(ncols=7, figsize=(30,5))
plt.subplots_adjust(wspace=0.05)  #Remove some whitespace between subplots

for idx, gp in df.groupby(df.Date.dt.dayofweek):
    ax[idx].set_title(gp.Date.dt.day_name().iloc[0])  #Set title to the weekday

    (gp.groupby(gp.Date.dt.hour).size().rename_axis('Tweet Hour').to_frame('')
        .reindex(np.arange(0,24,1)).fillna(0)
        .plot(kind='bar', ax=ax[idx], rot=0, ec='k', legend=False))

    # Ticks and labels on leftmost only
    if idx == 0:
        _ = ax[idx].set_ylabel('Counts', fontsize=11)

    _ = ax[idx].tick_params(axis='both', which='major', labelsize=7,
                            labelleft=(idx == 0), left=(idx == 0))

# Consistent bounds between subplots. 
lb, ub = list(zip(*[axis.get_ylim() for axis in ax]))
for axis in ax:
    axis.set_ylim(min(lb), max(ub)) 

plt.show()
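
The per-day counting done inside the loop above can also be computed in a single step, which may be handy for checking the numbers before plotting. The following is only a sketch under assumptions: the sample dates are made up for illustration, and the names dates and counts are not from the original answer.

```python
import pandas as pd

# Illustrative sample data (an assumption, similar in spirit to the data above)
n = 10000
dates = pd.Series(pd.date_range('2010-01-01', freq='65s', periods=n), name='Date')

# Count rows per (day of week, hour).
# Rows: 0=Monday .. 6=Sunday, columns: hour 0..23.
counts = (dates.groupby([dates.dt.dayofweek.rename('dow'),
                         dates.dt.hour.rename('hour')])
               .size()
               .unstack(fill_value=0)
               # make sure every day and every hour is present, even if empty
               .reindex(index=range(7), columns=range(24), fill_value=0))

print(counts.shape)
```

Each row of counts then holds the 24 hourly values for one weekday, which is the same information each panel of the figure shows.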

If you'd like to make the aspect ratio less extreme, consider plotting on a 2x4 grid (2 rows, 4 columns) instead. The plot is very similar to the one above once we flatten the axes array. Some integer and remainder division is used to figure out which axes need the labels.

fig, ax = plt.subplots(nrows=2, ncols=4, figsize=(20,10))
fig.delaxes(ax[1,3])  #7 days in a week, remove 8th panel
ax = ax.flatten()  #Far easier to work with a flattened array

lsize=8
plt.subplots_adjust(wspace=0.05, hspace=0.15)  #Remove some whitespace between subplots

for idx, gp in df.groupby(df.Date.dt.dayofweek):
    ax[idx].set_title(gp.Date.dt.day_name().iloc[0])  #Set title to the weekday

    (gp.groupby(gp.Date.dt.hour).size().rename_axis([None]).to_frame()
        .reindex(np.arange(0,24,1)).fillna(0)
        .plot(kind='bar', ax=ax[idx], rot=0, ec='k', legend=False))

    # Titles on correct panels
    if idx%4 == 0:
        _ = ax[idx].set_ylabel('Counts', fontsize=11)
    if (idx//4 == 1) | (idx%4 == 3):
        _ = ax[idx].set_xlabel('Tweet Hour', fontsize=11) 

    # Ticks on correct panels
    _ = ax[idx].tick_params(axis='both', which='major', labelsize=lsize,
                            labelbottom=(idx//4 == 1) | (idx%4 == 3), 
                            bottom=(idx//4 == 1) | (idx%4 == 3),
                            labelleft=(idx%4 == 0), 
                            left=(idx%4 == 0))

# Consistent bounds between subplots. 
lb, ub = list(zip(*[axis.get_ylim() for axis in ax]))
for axis in ax:
    axis.set_ylim(min(lb), max(ub)) 

plt.show()
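
The integer and remainder division used for the labels can be checked in isolation. This is a small sketch, not part of the original answer: panel_flags is a hypothetical helper, and it assumes the 7 panels fill a 2x4 grid row by row with the 8th slot removed, as in the code above.

```python
NCOLS = 4  # assumed grid width, matching ncols=4 above

def panel_flags(idx, ncols=NCOLS):
    """Return (row, col, needs_ylabel, needs_xlabel) for a flattened panel index."""
    row, col = divmod(idx, ncols)  # same as (idx // ncols, idx % ncols)
    needs_ylabel = (col == 0)      # leftmost column gets the y label
    # bottom row, or the panel directly above the removed 8th slot
    needs_xlabel = (row == 1) or (col == 3)
    return row, col, needs_ylabel, needs_xlabel

for idx in range(7):
    print(idx, panel_flags(idx))
```

Panel 3 (top right) gets an x label because nothing is drawn below it once the 8th axes is deleted.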

What about using seaborn? sns.FacetGrid was made for this:

import pandas as pd
import seaborn as sns

# make some data (150 minutes = 2.5 hours; the 'H' alias is deprecated in recent pandas)
date = pd.date_range('today', periods=100, freq='150min')

# put in dataframe
df = pd.DataFrame({
    'date' : date
})

# create day_of_week and hour columns
df['dow'] = df.date.dt.day_name()
df['hour'] = df.date.dt.hour

# create facet grid
g = sns.FacetGrid(data=df.groupby([
    'dow',
    'hour'
]).hour.count().to_frame(name='day_hour_count').reset_index(), col='dow', col_order=[
    'Sunday',
    'Monday',
    'Tuesday',
    'Wednesday',
    'Thursday',
    'Friday',
    'Saturday'
], col_wrap=3)

# map barplot to each subplot; a fixed order keeps the hour axis aligned across facets
g.map(sns.barplot, 'hour', 'day_hour_count', order=list(range(24)))
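
With only 100 timestamps, many (day, hour) combinations never occur, so they are simply absent from the grouped frame. One possible way to give every facet all 24 hours is to zero-fill the missing combinations before plotting. This is a sketch under assumptions: the fixed start date and the names days, full_index and counts are illustrative and not from the original answer.

```python
import pandas as pd

days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday',
        'Thursday', 'Friday', 'Saturday']

# Illustrative data; a fixed start date is assumed so the result is reproducible
date = pd.date_range('2021-01-03', periods=100, freq='150min')
df = pd.DataFrame({'date': date})
df['dow'] = df.date.dt.day_name()
df['hour'] = df.date.dt.hour

# Every (day, hour) combination, whether or not it occurs in the data
full_index = pd.MultiIndex.from_product([days, range(24)], names=['dow', 'hour'])

counts = (df.groupby(['dow', 'hour']).size()
            .reindex(full_index, fill_value=0)  # absent combinations become 0
            .rename('day_hour_count')
            .reset_index())

print(counts.head())
```

The resulting counts frame has the same three columns as the frame built inline above, so it could be passed as data= to sns.FacetGrid in its place.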
