
How can I set two-level ticks for Pandas hour/weekday plot?

I have a DataFrame which is structurally similar to the following:

from datetime import datetime
import pandas as pd
from mpu.datetime import generate  # pip install mpu

mind, maxd = datetime(2018, 1, 1), datetime(2018, 12, 30)
df = pd.DataFrame({'datetime': [generate(mind, maxd) for _ in range(10)]})

I want to understand how this data is distributed over hours of the day and days of the week. I can get them via:

df['weekday'] = df['datetime'].dt.weekday
df['hour'] = df['datetime'].dt.hour

And finally I have the plot:

import matplotlib.pyplot as plt

ax = df.groupby(['weekday', 'hour'])['datetime'].count().plot(kind='line', color='blue')
ax.set_ylabel("#")
ax.set_xlabel("time")
plt.show()

which gives me:

[image]

But as you can see, the weekdays are hard to distinguish and the hours are not noticeable at all. How can I get two-level labels similar to the following?

[image]

If you assume that every possible weekday and hour actually appears in the data, the axis units will simply be hours, with Monday midnight being 0, and Sunday 23h being 24*7-1 = 167. You can then tick every 24 hours with major ticks and label every noon with the respective day of the week.

import numpy as np; np.random.seed(42)
import datetime as dt
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator, FuncFormatter, NullFormatter

# Generate example data
N = 5030
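# Draw N random timestamps within 2018. The offsets carry an explicit unit of
# seconds, so nothing depends on how NumPy treats a unit-less timedelta64.
delta = (dt.datetime(2019, 1, 1) - dt.datetime(2018, 1, 1)).total_seconds()
df = pd.DataFrame({'datetime': pd.Timestamp("2018-01-01") +
                               pd.to_timedelta(delta*np.random.rand(N), unit="s")})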

# Group the data
df['weekday'] = df['datetime'].dt.weekday
df['hour'] = df['datetime'].dt.hour

counts = df.groupby(['weekday', 'hour'])['datetime'].count()

ax = counts.plot(kind='line', color='blue')
ax.set_ylabel("#")
ax.set_xlabel("time")
ax.grid()
# Now we assume that there is data for every hour and day present
assert len(counts) == 7*24
# Hence we can tick the axis with multiples of 24h
ax.xaxis.set_major_locator(MultipleLocator(24))
ax.xaxis.set_minor_locator(MultipleLocator(1))

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
def tick(x, pos):
    # Label only the noon position of each day; skip ticks outside 0..167
    if 0 <= x < 7*24 and x % 24 == 12:
        return days[int(x)//24]
    return ""
ax.xaxis.set_major_formatter(NullFormatter())
ax.xaxis.set_minor_formatter(FuncFormatter(tick))
ax.tick_params(which="major", axis="x", length=10, width=1.5)
plt.show()

[image]
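The assert in the code above fails as soon as some weekday/hour combination never occurs, which is the case for the ten-row example in the question. A minimal sketch of one possible workaround, assuming missing combinations should simply count as zero: reindex the grouped counts onto the full 7 x 24 grid. The three timestamps and the names full_index and counts_full are made up for illustration.

```python
import pandas as pd

# Hypothetical sparse data: two events on a Monday morning, one late on a Sunday
df = pd.DataFrame({"datetime": pd.to_datetime(
    ["2018-01-01 09:15", "2018-01-01 09:45", "2018-01-07 23:30"])})
df["weekday"] = df["datetime"].dt.weekday
df["hour"] = df["datetime"].dt.hour

counts = df.groupby(["weekday", "hour"])["datetime"].count()

# Every (weekday, hour) pair, in the order Monday 0h ... Sunday 23h
full_index = pd.MultiIndex.from_product(
    [range(7), range(24)], names=["weekday", "hour"])

# Combinations without data become 0 instead of being absent
counts_full = counts.reindex(full_index, fill_value=0)
```

With counts_full in place of counts, position 0 is again Monday midnight and position 167 is Sunday 23h, so the tick setup from the answer should apply unchanged.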

It is not exactly the visualization you mentioned, but an idea would be to unstack your pandas time series and then plot.

df.groupby(['weekday', 'hour'])['datetime'].count().unstack(level=0).plot()

With the data provided in your code, the result is:

[image]
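After unstacking, the columns are the weekday numbers 0 to 6, so the legend shows bare integers. A small sketch of how the columns could be renamed to day names before plotting; the random sample data and the names days and per_day are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 2000 random timestamps within 2018
rng = np.random.default_rng(0)
seconds = rng.integers(0, 365 * 24 * 3600, size=2000)
df = pd.DataFrame(
    {"datetime": pd.Timestamp("2018-01-01") + pd.to_timedelta(seconds, unit="s")})
df["weekday"] = df["datetime"].dt.weekday
df["hour"] = df["datetime"].dt.hour

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Rows are hours, columns are weekdays (0 = Monday)
per_day = df.groupby(["weekday", "hour"])["datetime"].count().unstack(level=0)

# Map the weekday numbers to readable names for the legend
per_day = per_day.rename(columns=dict(enumerate(days)))
```

Calling per_day.plot() afterwards draws one line per weekday over the hours of the day, with the day names in the legend.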

I was not able to test it with your dataset, and pandas datetimes do not always play well with matplotlib's date handling. But the idea is to set major and minor ticks and define their grid properties separately:

import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import dates as mdates

#create sample data and plot it
from io import StringIO
data = StringIO("""
X,A,B
2018-11-21T12:04:20,1,8
2018-11-21T18:14:17,6,7
2018-11-22T02:18:21,8,14
2018-11-22T12:31:54,7,8
2018-11-22T20:33:20,5,5
2018-11-23T12:23:12,13,2
2018-11-23T21:31:05,7,12
""")
df = pd.read_csv(data, parse_dates = True, index_col = "X")
ax=df.plot()

#format major locator
ax.xaxis.set_major_locator(mdates.DayLocator())
#format minor locator with specific hours
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour = [8, 12, 18]))
#label major ticks
ax.xaxis.set_major_formatter(mdates.DateFormatter('%a %d %m'))
#label minor ticks
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%H:00"))
#set grid for major ticks
ax.grid(which = "major", axis = "x", linestyle = "-", linewidth = 2)
#set grid for minor ticks with different properties
ax.grid(which = "minor", axis = "x", linestyle = "--", linewidth = 1)

plt.show()

Sample output: [image]
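To get closer to the two-row look asked for in the question, one common trick is to start the major tick format with a newline, so that the day labels sit one line below the hour labels. The following is only a sketch under assumptions: it plots synthetic hourly data with plain matplotlib rather than the DataFrame above, and the Agg backend is chosen so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np

# Hypothetical data: one value per hour over three days
x = np.arange("2018-11-21", "2018-11-24", dtype="datetime64[h]")
y = np.arange(len(x))

fig, ax = plt.subplots()
ax.plot(x, y)

# Major ticks at midnight, minor ticks at selected hours
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=[6, 12, 18]))

# The leading newline pushes the day name onto a second row
ax.xaxis.set_major_formatter(mdates.DateFormatter("\n%a"))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%H"))
ax.tick_params(which="major", axis="x", length=12)

fig.canvas.draw()  # forces the tick labels to be computed
major_labels = [t.get_text() for t in ax.get_xticklabels()]
minor_labels = [t.get_text() for t in ax.get_xticklabels(minor=True)]
```

The day names then mark the day boundaries rather than the middle of each day; centering them would need a different locator, for example one that ticks at noon.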
