Plotting y=times (as data) versus x=dates in matplotlib: How to format the y-axis for all dates?

None of the SO answers about "plotting times vs. dates" solved my problem:
Given two series of datetime data, I want to plot the time part versus the date part.

Using the code below, I get the desired formatting, but only for the first point (of both series).
I can't figure out how to define y-axis limits (or a format) that would work for all dates.

Thanks in advance for pointing out my mistake(s)!

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import matplotlib.dates as mdates

data = np.array(
    [( 0, '2021-04-14T07:45:00.000000000', '2021-04-14T22:42:00.000000000'),
     ( 1, '2021-04-15T06:37:00.000000000', '2021-04-15T23:20:00.000000000'),
     ( 2, '2021-04-16T06:45:00.000000000', '2021-04-16T22:45:00.000000000'),
     ( 3, '2021-04-17T06:35:00.000000000', '2021-04-17T23:01:00.000000000'),
     ( 4, '2021-04-18T06:30:00.000000000', '2021-04-18T22:50:00.000000000'),
     ( 5, '2021-04-19T06:14:00.000000000', '2021-04-19T23:05:00.000000000'),
     ( 6, '2021-04-20T07:10:00.000000000', '2021-04-21T00:00:00.000000000'),
     ( 7, '2021-04-21T06:37:00.000000000', '2021-04-21T22:30:00.000000000'),
     ( 8, '2021-04-22T07:25:00.000000000', '2021-04-22T23:40:00.000000000'),
     ( 9, '2021-04-23T06:24:00.000000000', '2021-04-23T23:45:00.000000000')],
     dtype=[('index', '<i8'), ('up_dt', '<M8[ns]'), ('down_dt', '<M8[ns]')])

# the actual data comes from a csv, so I use pd for further manipulations
df = pd.DataFrame.from_records(data)


fig, ax1 = plt.subplots(figsize=(9,7))

line1 = ax1.plot(df.up_dt.dt.date, df.up_dt,
                 label='rise time',
                 marker='^', linewidth=0, color='b')
line2 = ax1.plot(df.down_dt.dt.date, df.down_dt,
                 label='bed time',
                 marker='v', linewidth=0, color='b')

# x_axis ranges & format:
ax1.set_xlim([df.up_dt.min()-pd.DateOffset(days=1),
              df.up_dt.max()+pd.DateOffset(days=1)]);
ax1.xaxis.set_major_locator(mdates.DayLocator(interval=1))
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))

# y_axis ranges & format:
d1 = str(df.up_dt.min().date())+' 00:00'
d2 = str(df.up_dt.min().date())+' 23:59'
y_time = pd.date_range(start=d1, end=d2,freq='H')
ax1.set_ylim([y_time.min(), y_time.max()])

ax1.yaxis.set_major_locator(mdates.HourLocator(byhour=range(24), interval=2))
ax1.yaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))

ax1.set_ylabel('Clock Hours')
fig.autofmt_xdate()

VER = """python: 3.7.6 [Windows 10, MSC v.1916 64 bit (AMD64)]
pandas: 1.2.3; numpy: 1.19.2; matplotlib: 3.3.4"""

plt.title(F'Missing data if y-limit set with "df.<series1>.min()":\n{VER}')

plt.show();

Output:

[image: enter image description here]

If you take out the line where you set the y limits, you see that the y values are datetimes, not just times. So you have a couple of choices:

1) Set every value in the time column to the same date.

Because the y-axis formatter hides the date anyway, this is the quickest change to your code. The snippet below only shows the added or changed lines from your code block.

import datetime
...
# make a new column with one date and the time from up_dt
df["up_time"] = df.up_dt.apply(lambda d: datetime.datetime(2021, 1, 1, d.hour, d.minute, d.second, d.microsecond))
...
# plot using the new time column for the y values
line1 = ax1.plot(df.up_dt.dt.date, df.up_time,
                 label='rise time',
                 marker='^', linewidth=0, color='b')
...
# use the new time column when finding the y limits
d1 = str(df.up_time.min().date())+' 00:00'
d2 = str(df.up_time.min().date())+' 23:59'
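As a rough sketch of how option 1 might look when applied to both series at once, the version below moves every timestamp onto one placeholder date with vectorized pandas operations instead of `apply`. The names `ANCHOR` and `on_anchor_date`, and the `down_time` column, are illustrative assumptions and not part of the original answer; the three rows are copied from the question's data.

```python
import pandas as pd
from matplotlib import pyplot as plt
import matplotlib.dates as mdates

# Three rows copied from the question's data
df = pd.DataFrame({
    "up_dt": pd.to_datetime(["2021-04-14 07:45", "2021-04-15 06:37", "2021-04-16 06:45"]),
    "down_dt": pd.to_datetime(["2021-04-14 22:42", "2021-04-15 23:20", "2021-04-16 22:45"]),
})

# Arbitrary placeholder date (assumption); it is hidden by the '%H:%M' formatter
ANCHOR = pd.Timestamp("2021-01-01")

def on_anchor_date(s):
    """Keep each value's time of day but move it onto ANCHOR's date."""
    return (s - s.dt.normalize()) + ANCHOR

df["up_time"] = on_anchor_date(df.up_dt)
df["down_time"] = on_anchor_date(df.down_dt)

fig, ax1 = plt.subplots(figsize=(9, 7))
ax1.plot(df.up_dt.dt.normalize(), df.up_time,
         label="rise time", marker="^", linewidth=0, color="b")
ax1.plot(df.down_dt.dt.normalize(), df.down_time,
         label="bed time", marker="v", linewidth=0, color="b")

# y limits span the single placeholder day, so every point falls inside them
ax1.set_ylim([ANCHOR, ANCHOR + pd.Timedelta(days=1)])
ax1.yaxis.set_major_locator(mdates.HourLocator(interval=2))
ax1.yaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))

ax1.xaxis.set_major_locator(mdates.DayLocator(interval=1))
ax1.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
ax1.set_ylabel("Clock Hours")
fig.autofmt_xdate()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. Note that a bed time of exactly midnight (as in one row of the question's data) has a time part of 00:00 and would therefore land at the bottom of the axis.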

2) Use a decimal representation of time

If you want to strip off just the time portion without using a fake placeholder date, you need to convert it to a number, because neither matplotlib nor the standard datetime package treats time objects numerically. Using the pendulum package, we can convert a time to a decimal representation; here I convert to hours since midnight. You can then replace the tick labels with clock-style strings.

import pendulum as pend
...
# make separate date and time columns
df["up_date"] = df.up_dt.apply(lambda d: d.date())
df["up_time_hr"] = df.up_dt.apply(lambda d: (pend.parse(d.isoformat()).time() - pend.time(0)).seconds/3600)

# plot time vs date using the new columns
fig, ax1 = plt.subplots(figsize=(9,7))
line1 = ax1.plot(df.up_date, df.up_time_hr)

# example of setting tick labels
ax1.set_yticks([ 6.5, 7, 7.5])
ax1.set_yticklabels(["6:30", "7:00", "7:30"])
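If installing pendulum is not desirable, the same hours-since-midnight values can probably be computed with the pandas `.dt` accessors alone, and the tick labels can be generated instead of typed by hand. The sketch below is an alternative to the answer's approach, not a restatement of it; the helper name `hours_to_clock` and the half-hour tick spacing are assumptions chosen for illustration.

```python
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.ticker import FuncFormatter, MultipleLocator

# Two rise times copied from the question's data
up_dt = pd.Series(pd.to_datetime(["2021-04-14 07:45", "2021-04-15 06:37"]))

# Hours since midnight as a float, using only pandas datetime accessors
up_time_hr = up_dt.dt.hour + up_dt.dt.minute / 60 + up_dt.dt.second / 3600

def hours_to_clock(value, pos=None):
    """Format decimal hours (e.g. 6.5) as a clock string (e.g. '06:30')."""
    total_minutes = int(round(value * 60))
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}"

fig, ax1 = plt.subplots(figsize=(9, 7))
ax1.plot(up_dt.dt.normalize(), up_time_hr, marker="^", linewidth=0, color="b")

# One tick every half hour, labelled as clock time rather than as a decimal
ax1.yaxis.set_major_locator(MultipleLocator(0.5))
ax1.yaxis.set_major_formatter(FuncFormatter(hours_to_clock))
```

With a formatter function, the labels stay correct if the y limits or tick spacing change later, which hard-coded `set_yticklabels` strings do not.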
