I have a DataFrame which is structurally similar to the following:
from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt
from mpu.datetime import generate # pip install mpu
mind, maxd = datetime(2018, 1, 1), datetime(2018, 12, 30)
df = pd.DataFrame({'datetime': [generate(mind, maxd) for _ in range(10)]})
I want to understand how this data is distributed over hours of the day and days of the week. I can get them via:
df['weekday'] = df['datetime'].dt.weekday
df['hour'] = df['datetime'].dt.hour
And finally I have the plot:
ax = df.groupby(['weekday', 'hour'])['datetime'].count().plot(kind='line', color='blue')
ax.set_ylabel("#")
ax.set_xlabel("time")
plt.show()
which gives me:
However, the weekdays are hard to distinguish and the hours are not visible at all. How can I get two-level tick labels, with hours on one level and weekdays on the other?
If you assume that every possible weekday and hour actually appears in the data, the axis units will simply be hours: the position of each point is weekday*24 + hour, with Monday midnight being 0 and Sunday 23:00 being 24*7-1 = 167. You can then place a major tick every 24 hours and use the minor tick at each noon to label the respective day of the week.
import numpy as np; np.random.seed(42)
import datetime as dt
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator, FuncFormatter, NullFormatter
# Generate example data
N = 5030
delta = (dt.datetime(2019, 1, 1) - dt.datetime(2018, 1, 1)).total_seconds()
# Use explicit second units so the random offsets are interpreted as seconds
df = pd.DataFrame({'datetime': np.datetime64("2018-01-01", "s") +
                   (delta*np.random.rand(N)).astype("timedelta64[s]")})
# Group the data
df['weekday'] = df['datetime'].dt.weekday
df['hour'] = df['datetime'].dt.hour
counts = df.groupby(['weekday', 'hour'])['datetime'].count()
ax = counts.plot(kind='line', color='blue')
ax.set_ylabel("#")
ax.set_xlabel("time")
ax.grid()
# Now we assume that there is data for every hour and day present
assert len(counts) == 7*24
# Hence we can tick the axis with multiples of 24h
ax.xaxis.set_major_locator(MultipleLocator(24))
ax.xaxis.set_minor_locator(MultipleLocator(1))
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
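def tick(x, pos):
    # Label only the noon tick of each day; ignore positions outside the week
    if 0 <= x < len(days)*24 and x % 24 == 12:
        return days[int(x)//24]
    else:
        return ""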
ax.xaxis.set_major_formatter(NullFormatter())
ax.xaxis.set_minor_formatter(FuncFormatter(tick))
ax.tick_params(which="major", axis="x", length=10, width=1.5)
plt.show()
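The assertion above fails as soon as one weekday/hour combination is missing from the data, which is likely for small samples such as the 10 rows in the question. A minimal sketch of one way around this, assuming that a count of 0 is the right value for absent combinations: reindex the grouped counts onto the complete 7 x 24 grid before plotting. The sample timestamps and the name `full_index` are made up for illustration.

```python
import pandas as pd

# Small illustrative sample: only a few (weekday, hour) combinations occur
df = pd.DataFrame({"datetime": pd.to_datetime([
    "2018-01-01 00:15",  # Monday, hour 0
    "2018-01-01 00:45",  # Monday, hour 0
    "2018-01-03 12:30",  # Wednesday, hour 12
    "2018-01-07 23:59",  # Sunday, hour 23
])})
df["weekday"] = df["datetime"].dt.weekday
df["hour"] = df["datetime"].dt.hour
counts = df.groupby(["weekday", "hour"])["datetime"].count()

# Build the complete 7 x 24 grid and fill absent combinations with 0
full_index = pd.MultiIndex.from_product(
    [range(7), range(24)], names=["weekday", "hour"])
counts = counts.reindex(full_index, fill_value=0)

# Every point now sits at axis position weekday * 24 + hour
assert len(counts) == 7 * 24
```

After this step the tick setup with `MultipleLocator(24)` should apply unchanged, since the series always has 168 entries.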
I was not able to test this with your dataset, and pandas datetime handling sometimes conflicts with matplotlib's date handling. The idea, however, is to set major and minor ticks and define their grid properties separately:
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import dates as mdates
#create sample data and plot it
from io import StringIO
data = StringIO("""
X,A,B
2018-11-21T12:04:20,1,8
2018-11-21T18:14:17,6,7
2018-11-22T02:18:21,8,14
2018-11-22T12:31:54,7,8
2018-11-22T20:33:20,5,5
2018-11-23T12:23:12,13,2
2018-11-23T21:31:05,7,12
""")
df = pd.read_csv(data, parse_dates = True, index_col = "X")
ax=df.plot()
#format major locator
ax.xaxis.set_major_locator(mdates.DayLocator())
#format minor locator with specific hours
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour = [8, 12, 18]))
#label major ticks
ax.xaxis.set_major_formatter(mdates.DateFormatter('%a %d %m'))
#label minor ticks
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%H:00"))
#set grid for major ticks
ax.grid(which = "major", axis = "x", linestyle = "-", linewidth = 2)
#set grid for minor ticks with different properties
ax.grid(which = "minor", axis = "x", linestyle = "--", linewidth = 1)
plt.show()
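To apply the same major/minor locator idea to weekday/hour counts like those in the question, one option is to map each (weekday, hour) pair onto a timestamp in a reference week and plot with matplotlib directly. This is only a sketch under several assumptions: the random example data, the choice of the week starting Monday 2018-01-01 as reference, and the names `ref_monday` and `full` are all illustrative, and `%a` produces English day names only in an English or C locale.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates as mdates

# Example data: random timestamps in 2018 (assumption, for illustration only)
rng = np.random.default_rng(42)
stamps = pd.Timestamp("2018-01-01") + pd.to_timedelta(
    rng.integers(0, 365 * 24 * 3600, size=5000), unit="s")
df = pd.DataFrame({"datetime": stamps})
df["weekday"] = df["datetime"].dt.weekday
df["hour"] = df["datetime"].dt.hour

# Count per (weekday, hour); fill combinations that never occur with 0
full = pd.MultiIndex.from_product(
    [range(7), range(24)], names=["weekday", "hour"])
counts = df.groupby(["weekday", "hour"])["datetime"].count()
counts = counts.reindex(full, fill_value=0)

# Map each (weekday, hour) to a timestamp in a reference week.
# 2018-01-01 is a Monday, so weekday 0 falls on that date.
ref_monday = pd.Timestamp("2018-01-01")
x = [ref_monday + pd.Timedelta(days=int(d), hours=int(h))
     for d, h in counts.index]

# Plot with matplotlib itself so that the mdates locators apply
fig, ax = plt.subplots()
ax.plot(x, counts.to_numpy(), color="blue")
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%a"))
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=[6, 12, 18]))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%H"))
# Push the weekday labels below the hour labels
ax.tick_params(which="major", axis="x", length=10, pad=15)
```

Call `plt.show()` afterwards to display the figure. Note that the weekday labels sit at midnight of each day here, not centred under the day as in the noon-label approach above.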