
Plot histogram of epoch list, x axis by month-year in PyPlot

Given a list of epoch timestamps, is there a parameter in pyplot or numpy that produces a histogram whose bins match the months in the data? In this example, the list contains random dates from 2012 to 2013. I would like the histogram to show bars from, for example, February 2012 to October 2013 if the data only contains dates from those months.

This code draws a histogram, but the number of bins is set manually with bins=24.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import random

data = [int(random.randint(1293836400, 1356994800)) for _ in range(1000)]

# convert epoch seconds to Matplotlib date numbers
# (mdates.epoch2num is no longer available in recent Matplotlib releases)
mpl_data = mdates.date2num(np.asarray(data, dtype='int64').astype('datetime64[s]'))

fig, ax = plt.subplots(1,1)
ax.hist(mpl_data, bins=24, ec='black')
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d.%m.%y'))
fig.autofmt_xdate()
plt.show()

To do this, you have to compute the timestamp at which each month begins and use those timestamps as the bin edges. Dates and times are always trickier than plain numbers, so while this code looks a bit cumbersome, it does work.

import datetime as d
import random

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

data = [int(random.randint(1293836400, 1356994800)) for _ in range(1000)]

# build the bin edges: one timestamp at the start of each month (local time),
# stepping forward with datetime objects until the latest date is covered
mindate = d.datetime.fromtimestamp(min(data))
maxdate = d.datetime.fromtimestamp(max(data))
bindate = d.datetime(year=mindate.year, month=mindate.month, day=1)
bins = [int(bindate.timestamp())]
while bindate < maxdate:
    if bindate.month == 12:
        # roll over from December to January of the next year
        bindate = d.datetime(year=bindate.year + 1, month=1, day=1)
    else:
        bindate = d.datetime(year=bindate.year, month=bindate.month + 1, day=1)
    bins.append(int(bindate.timestamp()))

# convert epoch seconds to Matplotlib date numbers
# (mdates.epoch2num is no longer available in recent Matplotlib releases)
bins = mdates.date2num(np.asarray(bins, dtype='int64').astype('datetime64[s]'))
mpl_data = mdates.date2num(np.asarray(data, dtype='int64').astype('datetime64[s]'))

fig, ax = plt.subplots(1,1, figsize=(16, 4), facecolor='white')
ax.hist(mpl_data, bins=bins, ec='black')
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d.%m.%y'))
fig.autofmt_xdate()
plt.show()
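
The month-stepping loop above works in local time, so the exact edges depend on the machine's timezone. As a rough sketch of the same idea with the timezone pinned to UTC, the edges can be computed by a small standard-library helper. The name month_start_edges and the sample timestamps are made up for illustration and are not part of the original answer.

```python
from datetime import datetime, timezone

def month_start_edges(timestamps):
    """Return UTC datetimes at the first of each month covering `timestamps`.

    The list starts at the month containing the earliest timestamp and ends
    at the first day of the month after the latest one, so every value falls
    inside some bin.
    """
    first = datetime.fromtimestamp(min(timestamps), tz=timezone.utc)
    last = datetime.fromtimestamp(max(timestamps), tz=timezone.utc)
    year, month = first.year, first.month
    edges = [datetime(year, month, 1, tzinfo=timezone.utc)]
    while edges[-1] <= last:
        # step to the first day of the next month, rolling over the year
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
        edges.append(datetime(year, month, 1, tzinfo=timezone.utc))
    return edges

# Hypothetical sample: 2012-02-15 and 2012-04-10, both at 00:00 UTC
sample = [1329264000, 1334016000]
edges = month_start_edges(sample)
print([e.strftime('%Y-%m') for e in edges])
```

The resulting datetimes should be usable as histogram bins after passing them through mdates.date2num, which accepts timezone-aware datetime objects.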


Another approach is to use pandas to group the data by month and then count the entries in each group. The code is much shorter, and you can make a quick bar plot. Re-creating your plot above exactly would take more work, but this gives you a feel for what you can do with other tools:

import pandas as pd

srs = pd.to_datetime(data, unit='s')  # epoch seconds -> DatetimeIndex
df = pd.DataFrame({'count': np.ones(shape=len(srs))}, index=srs)
fig, ax = plt.subplots(1, 1, figsize=(16,4), facecolor='white')
# freq='M' groups by calendar month (month end); newer pandas versions prefer the alias 'ME'
df.groupby(pd.Grouper(freq='M')).count().plot.bar(ax=ax)
plt.show()
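
To make clear what the group-by-month step computes, here is a minimal sketch of the same counting using only the standard library. It is not the pandas implementation. The function name count_per_month and the sample values are hypothetical, and the months are taken in UTC as an assumption.

```python
from collections import Counter
from datetime import datetime, timezone

def count_per_month(timestamps):
    """Count epoch-second timestamps per (year, month), using UTC."""
    counts = Counter()
    for ts in timestamps:
        dt = datetime.fromtimestamp(ts, tz=timezone.utc)
        counts[(dt.year, dt.month)] += 1
    # sort by (year, month) so the result reads chronologically
    return dict(sorted(counts.items()))

# Hypothetical sample: two timestamps in February 2012, one in April 2012
sample = [1329264000, 1329350400, 1334016000]
monthly = count_per_month(sample)
print(monthly)
```

One difference worth noting: this sketch only lists months that contain data, whereas grouping with pd.Grouper on a DatetimeIndex also yields the empty months in between with a count of zero.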

