Python pandas plot time-series with gap

I am trying to plot a pandas DataFrame whose Timestamp index contains a time gap. Using DataFrame.plot() results in linear interpolation between the last Timestamp of the former segment and the first Timestamp of the next. I do not want linear interpolation, nor do I want empty space between the two date segments. Is there a way to do that?

Suppose we have a DataFrame with a Timestamp index:

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> df = pd.DataFrame(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
>>> df = df.cumsum()

Now let's take two time chunks of it and plot them:

>>> df = pd.concat([df['Jan 2000':'Aug 2000'], df['Jan 2001':'Aug 2001']])
>>> df.plot()
>>> plt.show()

The resulting plot has an interpolation line connecting the Timestamps on either side of the gap. I cannot figure out how to upload pictures on this machine, but these pictures from Google Groups show my problem (interpolated.jpg, no-interpolation.jpg and no gaps.jpg). I can recreate the first as shown above. The second is achievable by replacing all gap values with NaN (see also this question). How can I achieve the third version, where the time gap is omitted?
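For reference, the second version (no interpolation, but with empty space) can be sketched as follows. This is one possible way to get NaN into the gap, assuming daily data as in the example above; the names `full`, `chunks`, `daily` and `with_nan` are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example data: a random walk over 1000 consecutive days
full = pd.DataFrame(np.random.randn(1000),
                    index=pd.date_range('2000-01-01', periods=1000)).cumsum()
chunks = pd.concat([full.loc['2000-01':'2000-08'],
                    full.loc['2001-01':'2001-08']])

# Reindex onto a continuous daily range; every date inside the gap becomes NaN
daily = pd.date_range(chunks.index.min(), chunks.index.max(), freq='D')
with_nan = chunks.reindex(daily)

# with_nan.plot() now draws two separate segments, because lines are broken at NaN
```

This removes the connecting line but still leaves the blank stretch on the x-axis, so it does not produce the third version.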

Try:

df.plot(x=df.index.astype(str))

[Figure: the resulting plot, with the gap skipped]

You may want to customize ticks and tick labels.
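If the call above does not work with your pandas version, the same idea can be written directly against matplotlib: plot the values against their row position and label a few positions with the corresponding dates. The sketch below is an assumption about how one might do this, not part of the original answer; `plot_without_gaps`, `n_ticks` and `date_format` are hypothetical names.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_without_gaps(frame, n_ticks=8, date_format='%Y-%m-%d'):
    """Plot each column against row position so missing dates take no space."""
    positions = np.arange(len(frame))
    fig, ax = plt.subplots()
    for column in frame.columns:
        ax.plot(positions, frame[column].to_numpy(), label=str(column))
    # Label a handful of evenly spaced row positions with their dates
    tick_positions = np.linspace(0, len(frame) - 1, n_ticks).astype(int)
    ax.set_xticks(tick_positions)
    ax.set_xticklabels(frame.index[tick_positions].strftime(date_format),
                       rotation=45, ha='right')
    ax.legend()
    fig.tight_layout()
    return ax


# Same example data as in the question
full = pd.DataFrame(np.random.randn(1000),
                    index=pd.date_range('2000-01-01', periods=1000)).cumsum()
chunks = pd.concat([full.loc['2000-01':'2000-08'],
                    full.loc['2001-01':'2001-08']])

ax = plot_without_gaps(chunks)
# plt.show()  # uncomment to display the figure
```

Because the x-axis holds row positions, the distance between ticks no longer reflects elapsed time. The jump from August 2000 to January 2001 takes up no horizontal space.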

EDIT

That works for me using pandas 0.17.1 and numpy 1.10.4.

All you really need is a way to convert the DatetimeIndex to another type which is not datetime-like. In order to get meaningful labels I chose str. If x=df.index.astype(str) does not work with your combination of pandas/numpy/whatever, you can try other options:

df.index.to_series().dt.strftime('%Y-%m-%d')
df.index.to_series().apply(lambda x: x.strftime('%Y-%m-%d'))
...

I realized that resetting the index is not necessary, so I removed that part.

In my case I had DatetimeIndex objects instead of Timestamps. The following works for me in pandas 0.24.2: converting the DatetimeIndex to strings eliminates the gaps in the time series.

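# `sql` (query string) and `sql_engine` (database connection) are defined elsewhere
df = pd.read_sql_query(sql, sql_engine)
df.set_index('date', inplace=True)
# String labels are not datetime-like, so the plot no longer leaves room for the gap
df.index = df.index.map(str)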
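To try these steps without a real database, here is a self-contained sketch that uses an in-memory SQLite database. The table name `prices`, the column `value` and the sample rows are made up for illustration; only the `date` column name comes from the snippet above.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data with a gap between August 2000 and January 2001
sample = pd.DataFrame({
    'date': pd.to_datetime(['2000-08-30', '2000-08-31',
                            '2001-01-01', '2001-01-02']),
    'value': [1.0, 2.0, 3.0, 4.0],
})

conn = sqlite3.connect(':memory:')
sample.to_sql('prices', conn, index=False)

# Read the rows back, parsing the stored text into datetimes
df = pd.read_sql_query('SELECT date, value FROM prices', conn,
                       parse_dates=['date'])
df.set_index('date', inplace=True)

# Convert the datetime index to plain strings before plotting
df.index = df.index.map(str)
conn.close()

# df.plot() now treats the index as labels rather than as points in time
```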


 