pandas DataFrame plot - impossible to set xtick intervals for timedelta values

I am trying to specify the x-axis tick interval when plotting DataFrames. I have several data files like this:

0:0:0 29
0:5:0 85
0:10:0 141
0:15:0 198
0:20:0 251
0:25:0 308
0:30:0 363
0:35:0 413 

The first column is time in %H:%M:%S format, but the hours go beyond 24 (up to 48 hours).

When I read the file as below and plot it, the result looks fine, but I want to set the xticks interval to 8 hours.

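import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker
import matplotlib.dates as mdates

df0 = pd.read_csv(fil, names=['Time', 'Count'], delim_whitespace=True, parse_dates=['Time'])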
df0 = df0.set_index('Time')
ax = matplotlib.pyplot.gca()
mkfunc = lambda x, pos: '%1.1fM' % (x * 1e-6) if x >= 1e6 else '%1.1fK' % (x * 1e-3) if x >= 1e3 else '%1.1f' % x
mkformatter = matplotlib.ticker.FuncFormatter(mkfunc)
ax.yaxis.set_major_formatter(mkformatter)

ax.xaxis.set_major_locator(mdates.HourLocator(interval=8))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H'))

df0.plot(ax=ax, x_compat=True, color='blue')
plt.grid()
plt.savefig('figure2.pdf',dpi=300, bbox_inches = "tight")

I tried the above method, as suggested by many answers here, but it resulted in the following warning:

Locator attempting to generate 1874 ticks ([-28.208333333333332, ..., 596.125]), which exceeds Locator.MAXTICKS (1000).

The figure also displayed many vertical lines. I tried converting my time column explicitly to timedelta, but that did not help either. I converted to timedelta as below.

custom_date_parser = lambda x: pd.to_timedelta(x.split('.')[0])
df0 = pd.read_csv(fil, names=['Time', 'Count'], delim_whitespace=True, parse_dates=['Time'], date_parser=custom_date_parser)

Could you please help me to identify the issue and set the xticks interval correctly?

The problem here is that a) matplotlib/pandas don't have much support for timedelta objects and b) you cannot use the HourLocator with your data because, after conversion to datetime objects, your axis would be labelled 0, 8, 16, 0, 8, 16... Instead, we can convert the timedelta values imported by your converter into hours and plot the numerical values:

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MultipleLocator
import numpy as np

# parse "H:M:S" strings (hours may exceed 24) into timedelta values
custom_date_parser = lambda x: pd.to_timedelta(x.split('.')[0])
df0 = pd.read_csv("test.txt", names=['Time', 'Count'], delim_whitespace=True, parse_dates=['Time'], date_parser=custom_date_parser)

# conversion into numerical hour value
df0["Time"] /= np.timedelta64(1, "h")
df0 = df0.set_index('Time')

ax = plt.gca()
df0.plot(ax=ax, x_compat=True, color='blue')

# abbreviate large y values with a K or M suffix
mkfunc = lambda x, pos: '%1.1fM' % (x * 1e-6) if x >= 1e6 else '%1.1fK' % (x * 1e-3) if x >= 1e3 else '%1.1f' % x
mkformatter = FuncFormatter(mkfunc)
ax.yaxis.set_major_formatter(mkformatter)

# set locator at regular hour intervals
ax.xaxis.set_major_locator(MultipleLocator(8))
ax.set_xlabel("Time (in h)")
plt.grid()

plt.show()

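If you would rather see tick labels such as 08:00 or 32:00 instead of plain numbers, one possible approach is a small formatter function for the numeric hour axis. The sketch below is only an illustration: the name hours_to_label and the HH:MM layout are my own assumptions, not something from the question.

```python
def hours_to_label(x, pos=None):
    """Format a float number of hours as 'HH:MM'; hours may exceed 24.

    The (x, pos) signature matches what matplotlib's FuncFormatter passes in.
    hours_to_label is a hypothetical helper name chosen for this sketch.
    """
    total_minutes = int(round(x * 60))           # round to whole minutes
    sign = "-" if total_minutes < 0 else ""      # keep negative axis values readable
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{sign}{hours:02d}:{minutes:02d}"
```

It should be usable with ax.xaxis.set_major_formatter(FuncFormatter(hours_to_label)), with FuncFormatter imported from matplotlib.ticker, while MultipleLocator(8) still decides where the ticks go.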

If for some reason you actually need datetime objects, you can convert your timedelta values by adding an arbitrary offset, since you intend to ignore the day value:

df0["Time"] += pd.to_datetime("2000-01-01 00:00:00 UTC") 

But I doubt this will be of advantage in your case.
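To illustrate why the datetime route probably does not help, here is a minimal sketch that compares hour-of-day labels with elapsed hours. The sample values and the variable names are made up for the illustration, and the offset date is arbitrary.

```python
import pandas as pd

# Hypothetical elapsed times every 8 hours, running past the 24 h mark
elapsed = pd.to_timedelta(["0:0:0", "8:0:0", "16:0:0", "24:0:0", "32:0:0", "40:0:0"])

# Adding an arbitrary offset turns the timedelta values into datetimes
stamps = elapsed + pd.Timestamp("2000-01-01")

# '%H' only knows the hour of the day, so the labels wrap after 24 hours
hour_labels = list(stamps.strftime("%H"))

# Elapsed hours as plain floats keep counting upwards
elapsed_hours = list(elapsed / pd.Timedelta(hours=1))

print(hour_labels)
print(elapsed_hours)
```

The hour-of-day labels repeat after one day, whereas the float values continue to 40, which is the reason the numerical axis is used above.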

As an aside: for debugging, it is useful not to use regularly spaced test data. In your example, you probably did not notice that the graph was plotted against the index (0, 1, 2...) and then relabeled with strings, imitating regularly spaced datetime objects. The following test data immediately reveal the problem:

0:0:0 29
0:5:0 85
0:10:0 141
3:15:0 98
5:20:0 251
17:25:0 308
27:30:0 63
35:35:0 413
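A quick way to check such test data before plotting is to convert the time column to hours and look at the gaps between rows. This is only a sketch under a few assumptions of mine: the data is pasted into an inline string named raw, the helper column is called Hours, and the file is read with sep=r"\s+" followed by a separate pd.to_timedelta call, since recent pandas releases deprecate delim_whitespace and date_parser.

```python
import io
import pandas as pd

# The irregular test data from above, inlined so the sketch is self-contained
raw = """0:0:0 29
0:5:0 85
0:10:0 141
3:15:0 98
5:20:0 251
17:25:0 308
27:30:0 63
35:35:0 413
"""

# Read the time column as plain strings, split on whitespace
df = pd.read_csv(io.StringIO(raw), names=["Time", "Count"], sep=r"\s+")

# Convert "H:M:S" strings (hours may exceed 24) to float hours
df["Hours"] = pd.to_timedelta(df["Time"]) / pd.Timedelta(hours=1)

# Differences between consecutive rows; regular data would give a single value
gaps = df["Hours"].diff().dropna()
evenly_spaced = gaps.nunique() == 1

print(df[["Hours", "Count"]])
print("evenly spaced:", evenly_spaced)
```

If the plot shows these points at equal distances along the x axis, it is being drawn against the row index rather than against the time values.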
