Python, most efficient way to subplot pandas data frame

I am trying to plot a figure with 9 subplots (in this example; the number would vary for other use cases), each a line graph showing the count of data points per date for one county/area.

So far I have:

Surrey1 = df[df.county == 'Surrey']
Surrey2 = Surrey1.county.groupby(df.date_stamp).value_counts()


East_Sussex1 = df[df.county == 'East Sussex']
East_Sussex2 = East_Sussex1.county.groupby(df.date_stamp).value_counts()

West_Sussex1 = df[df.county == 'West Sussex']
West_Sussex2 = West_Sussex1.county.groupby(df.date_stamp).value_counts()


Buck1 = df[df.county == 'Buckinghamshire']
Buck2 = Buck1.county.groupby(df.date_stamp).value_counts()


Norfolk1 = df[df.county == 'Norfolk']
Norfolk2 = Norfolk1.county.groupby(df.date_stamp).value_counts()


Suffolk1 = df[df.county == 'Suffolk']
Suffolk2 = Suffolk1.county.groupby(df.date_stamp).value_counts()

Essex1 = df[df.county == 'Essex']
Essex2 = Essex1.county.groupby(df.date_stamp).value_counts()

Kent1 = df[df.county == 'Kent']
Kent2 = Kent1.county.groupby(df.date_stamp).value_counts()

# Create the fig
fig, axes = plt.subplots(nrows=8, ncols=1, figsize=(12,6))

# Now plot
Norfolk2.plot(ax=axes[0], subplots=True, legend=False)
Surrey2.plot(ax=axes[1], subplots=True, legend=False)
East_Sussex2.plot(ax=axes[2], subplots=True, legend=False)
West_Sussex2.plot(ax=axes[3], subplots=True, legend=False)
Buck2.plot(ax=axes[4], subplots=True, legend=False)
Suffolk2.plot(ax=axes[5], subplots=True, legend=False)
Essex2.plot(ax=axes[6], subplots=True, legend=False)
Kent2.plot(ax=axes[7], subplots=True, legend=False)

Which produces:

[Image: the resulting plot]

Is there a quicker/more efficient way to do this? Tips on how to make the graph a little more presentable would be appreciated as well!

Update:

I'm now using a very simple function to do this more quickly for a variable metric:

def plot_freq(metric, graph_width, graph_height):
    plot = str(metric)
    df.groupby(plot)['date_stamp'].value_counts().unstack(0).plot(subplots=True, figsize=(graph_width, graph_height))
    print("This plot shows the number of data points by", metric)

[Image: output of plot_freq]

I think you can condense this down to the following:

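import numpy as np
import pandas as pd

# Random sample data: 500 rows, each with a date and a county
df = pd.DataFrame({
    'Date': np.random.choice(pd.date_range('2017-10-01', '2017-10-10', freq='D'), 500),
    'county': np.random.choice(['East Sussex', 'Buckinghamshire', 'Kent', 'Essex', 'Essex'], 500),
})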

df.groupby('county')['Date'].value_counts().unstack(0).plot(subplots=True)

Output:

[Image: output plot]
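To see what that one-liner builds, and as one possible way to tidy up the figure, here is a rough sketch on a tiny made-up sample. The sample data, the `counts` name, the figure size and the title/label choices are my own assumptions, not from the question. `fill_value=0` and `sort_index()` are additions to the chain above, so that dates with no rows for a county plot as 0 rather than as gaps.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import pandas as pd

# Tiny made-up sample; column names follow the question.
df = pd.DataFrame({
    "date_stamp": pd.to_datetime([
        "2017-10-01", "2017-10-01", "2017-10-02",
        "2017-10-02", "2017-10-02", "2017-10-03",
    ]),
    "county": ["Kent", "Essex", "Kent", "Kent", "Essex", "Surrey"],
})

# Count rows per (county, date), then pivot the counties into columns:
# the result has one row per date and one column per county.
counts = (
    df.groupby("county")["date_stamp"]
      .value_counts()
      .unstack(0, fill_value=0)  # missing (date, county) pairs become 0
      .sort_index()              # keep the dates in chronological order
)

n = len(counts.columns)

# One panel per county, stacked vertically, sharing both axes so the
# panels can be compared directly.
axes = counts.plot(
    subplots=True,
    layout=(n, 1),
    sharex=True,
    sharey=True,
    figsize=(12, 2 * n),
    legend=False,
)

# Use a title per panel instead of a legend.
for ax, county in zip(axes.ravel(), counts.columns):
    ax.set_title(county)
    ax.set_ylabel("count")

fig = axes.ravel()[0].get_figure()
fig.tight_layout()
```

Because the number of panels comes from `len(counts.columns)`, the same code should work for any number of counties. Whether `sharey=True` is a good idea depends on the data: it makes the counties comparable, but it can flatten the lines of counties with few data points.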
