Pandas - How to group-by and plot for each hour of each day of week
I need help figuring out how to draw subplots so that the data in the dataframe shown here can be compared easily:
Date                 A     B           C
2017-03-22 15:00:00  obj1  value_a     other_1
2017-03-22 14:00:00  obj2  value_ns    other_5
2017-03-21 15:00:00  obj3  value_kdsa  other_23
2014-05-08 17:00:00  obj2  value_as    other_4
2010-07-01 20:00:00  obj1  value_as    other_0
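I am trying to plot the number of occurrences per hour for each day of the week. That is, count the occurrences for each combination of weekday and hour, and plot those counts on subplots.
If the question is confusing, please let me know if you have any questions. Thanks.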
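You can do this with multiple groupby calls. Since we know a week has 7 days, we can fix the number of panels up front. If you groupby(df.Date.dt.dayofweek), the group key (0 for Monday through 6 for Sunday) can be used directly as the index into the array of subplot axes: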
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

n = 10000
np.random.seed(123)
df = pd.DataFrame({'Date': pd.date_range('2010-01-01', freq='1.09min', periods=n),
                   'A': np.random.randint(1,10,n),
                   'B': np.random.normal(0,1,n)})

fig, ax = plt.subplots(ncols=7, figsize=(30,5))
plt.subplots_adjust(wspace=0.05)  #Remove some whitespace between subplots

for idx, gp in df.groupby(df.Date.dt.dayofweek):
    ax[idx].set_title(gp.Date.dt.day_name().iloc[0])  #Set title to the weekday

    (gp.groupby(gp.Date.dt.hour).size().rename_axis('Tweet Hour').to_frame('')
        .reindex(np.arange(0,24,1)).fillna(0)
        .plot(kind='bar', ax=ax[idx], rot=0, ec='k', legend=False))

    # Ticks and labels on leftmost only
    if idx == 0:
        _ = ax[idx].set_ylabel('Counts', fontsize=11)

    _ = ax[idx].tick_params(axis='both', which='major', labelsize=7,
                            labelleft=(idx == 0), left=(idx == 0))

# Consistent bounds between subplots.
lb, ub = list(zip(*[axis.get_ylim() for axis in ax]))
for axis in ax:
    axis.set_ylim(min(lb), max(ub))

plt.show()
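To see what the nested groupby is counting, it can help to look at the numbers without any plotting. The following is a minimal sketch, not part of the original answer, that builds a 7 x 24 table of counts from the five sample rows in the question. The names `counts`, `dayofweek` and `hour` are chosen here for illustration only.

```python
import pandas as pd

# The five sample rows from the question (only the Date column matters here).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2017-03-22 15:00:00",
        "2017-03-22 14:00:00",
        "2017-03-21 15:00:00",
        "2014-05-08 17:00:00",
        "2010-07-01 20:00:00",
    ]),
    "A": ["obj1", "obj2", "obj3", "obj2", "obj1"],
})

# Count rows per (weekday, hour); weekday 0 is Monday and 6 is Sunday.
counts = (
    df.groupby([df.Date.dt.dayofweek, df.Date.dt.hour])
      .size()
      .rename_axis(["dayofweek", "hour"])   # illustrative level names
      .unstack(fill_value=0)                # rows: weekday, columns: hour
      .reindex(index=range(7), columns=range(24), fill_value=0)
)

print(counts.shape)
```

Each row of `counts` corresponds to one panel in the figure above, and each column to one bar within that panel.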
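If you want a less extreme aspect ratio, consider plotting on a 4x2 grid (2 rows of 4 panels) instead. Once we flatten the axes array, the code is very similar to the above. A bit of integer and remainder division determines which axes need labels.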
fig, ax = plt.subplots(nrows=2, ncols=4, figsize=(20,10))
fig.delaxes(ax[1,3])  #7 days in a week, remove 8th panel
ax = ax.flatten()  #Far easier to work with a flattened array

lsize=8
plt.subplots_adjust(wspace=0.05, hspace=0.15)  #Remove some whitespace between subplots

for idx, gp in df.groupby(df.Date.dt.dayofweek):
    ax[idx].set_title(gp.Date.dt.day_name().iloc[0])  #Set title to the weekday

    (gp.groupby(gp.Date.dt.hour).size().rename_axis([None]).to_frame()
        .reindex(np.arange(0,24,1)).fillna(0)
        .plot(kind='bar', ax=ax[idx], rot=0, ec='k', legend=False))

    # Titles on correct panels
    if idx%4 == 0:
        _ = ax[idx].set_ylabel('Counts', fontsize=11)
    if (idx//4 == 1) | (idx%4 == 3):
        _ = ax[idx].set_xlabel('Tweet Hour', fontsize=11)

    # Ticks on correct panels
    _ = ax[idx].tick_params(axis='both', which='major', labelsize=lsize,
                            labelbottom=(idx//4 == 1) | (idx%4 == 3),
                            bottom=(idx//4 == 1) | (idx%4 == 3),
                            labelleft=(idx%4 == 0),
                            left=(idx%4 == 0))

# Consistent bounds between subplots.
lb, ub = list(zip(*[axis.get_ylim() for axis in ax]))
for axis in ax:
    axis.set_ylim(min(lb), max(ub))

plt.show()
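The integer and remainder division used for label placement can be checked on its own. This is a small sketch under the same layout assumptions as the grid above (4 columns, 7 panels, 8th panel removed); `label_flags` is a hypothetical helper introduced here, not something from the answer or from matplotlib.

```python
def label_flags(idx, ncols=4):
    """Return (needs_left_labels, needs_bottom_labels) for flattened panel idx.

    Hypothetical helper mirroring the conditions used in the grid example.
    """
    row, col = divmod(idx, ncols)   # same as (idx // ncols, idx % ncols)
    needs_left = (col == 0)         # first column of each row
    # Bottom row, or the last column of the top row, which sits above
    # the removed 8th panel and so has nothing below it.
    needs_bottom = (row == 1) or (col == ncols - 1)
    return needs_left, needs_bottom


left_panels = [i for i in range(7) if label_flags(i)[0]]
bottom_panels = [i for i in range(7) if label_flags(i)[1]]
print(left_panels, bottom_panels)
```

With this layout, Monday and Friday (panels 0 and 4) carry the y-axis label, and Thursday through Sunday (panels 3 to 6) carry the x-axis label.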
How about seaborn? sns.FacetGrid is made for exactly this:
import pandas as pd
import seaborn as sns

# make some data
date = pd.date_range('today', periods=100, freq='2.5H')

# put in dataframe
df = pd.DataFrame({
    'date' : date
})

# create day_of_week and hour columns
df['dow'] = df.date.dt.day_name()
df['hour'] = df.date.dt.hour

# create facet grid
g = sns.FacetGrid(data=df.groupby([
    'dow',
    'hour'
]).hour.count().to_frame(name='day_hour_count').reset_index(), col='dow', col_order=[
    'Sunday',
    'Monday',
    'Tuesday',
    'Wednesday',
    'Thursday',
    'Friday',
    'Saturday'
], col_wrap=3)

# map barplot to each subplot
g.map(sns.barplot, 'hour', 'day_hour_count');
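One difference from the matplotlib answer is that the grouped frame passed to FacetGrid only contains the (day, hour) pairs that actually occur, so hours with no rows are not included as zeros. If zero counts are wanted, one possible approach is to reindex against every day and hour combination before plotting. The sketch below is an assumption-laden illustration, not part of the original answer: it uses a fixed start date and a `150min` frequency (equivalent to 2.5 hours) so the result is reproducible, and `full_index` and `counts` are names chosen here.

```python
import pandas as pd

days = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]

# Fixed start date (an assumption, for reproducibility) instead of 'today'.
date = pd.date_range("2021-01-03", periods=100, freq="150min")
df = pd.DataFrame({"date": date})
df["dow"] = df.date.dt.day_name()
df["hour"] = df.date.dt.hour

# Every (day, hour) combination: 7 * 24 = 168 rows.
full_index = pd.MultiIndex.from_product([days, range(24)],
                                        names=["dow", "hour"])

counts = (
    df.groupby(["dow", "hour"]).size()
      .reindex(full_index, fill_value=0)   # missing pairs become 0
      .rename("day_hour_count")
      .reset_index()
)

print(counts.head())
```

The resulting `counts` frame has the same three columns as the one built inline above, so it could be passed as `data=` to `sns.FacetGrid` in its place.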