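How can I automate the plotting of multiple 'chunks' of data from a very large time-series using Pandas?
My goal is to generate a time-series plot for each event listed in "event.csv", drawing the data from a large time-series dataset named "parsed.csv".
By manually defining the desired time range for an event, with a +/- 12 hour buffer, I can successfully plot a single event. There are hundreds of events, so some kind of automation is necessary. I am very new to loops/automation and am quite stuck.
Code: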
import matplotlib.pyplot as plt
import pandas as pd
df_event = pd.read_csv('event.csv', parse_dates=['Date_Time'], index_col=['Date_Time'])
df = pd.read_csv('parsed.csv',parse_dates=['Date_Time'],index_col= ['Date_Time'])
df.Verified = pd.to_numeric(df.Verified, errors='coerce') #converts the column to numeric; unparseable values become NaN
df.dropna(axis='index',how='any',inplace=True) #drops every row that contains a null value
df = df.loc['2018-05-01':'2018-05-06'] #can manually define event using this
fig, axs = plt.subplots(figsize=(12, 6)) #define axis, and plots
df.plot(ax=axs)
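For reference, the hard-coded date range above could instead be derived from a single event timestamp. The following is only a minimal sketch under assumptions: the helper name `event_window` and the synthetic `demo` frame are hypothetical stand-ins (not part of the original code), and the index is assumed to be a sorted `DatetimeIndex` like the one produced by `read_csv` above.

```python
import pandas as pd

def event_window(df, event_time, buffer=pd.Timedelta(hours=12)):
    """Return the rows of df within +/- buffer of event_time.

    Hypothetical helper; assumes df has a sorted DatetimeIndex.
    """
    event_time = pd.Timestamp(event_time)
    # Label-based slicing on a DatetimeIndex includes both endpoints.
    return df.loc[event_time - buffer : event_time + buffer]

# Synthetic stand-in for parsed.csv: 6-minute samples over three days.
idx = pd.date_range("2010-01-05", "2010-01-08", freq="6min", name="Date_Time")
demo = pd.DataFrame({"Predicted": 5.0, "Verified": 5.5}, index=idx)

# Window around the first event from the Event.csv sample.
window = event_window(demo, "2010-01-06 12:05:00")
```

With the real data, `event_window(df, df_event.index[0])` would play the role of the manual `df.loc['2018-05-01':'2018-05-06']` slice.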
Sample of my large time-series csv dataset:
Predicted Verified
Date_Time
2010-01-01 00:00:00 5.161 5.56
2010-01-01 00:06:00 5.187 5.57
2010-01-01 00:12:00 5.208 5.56
2010-01-01 00:18:00 5.222 5.55
2010-01-01 00:24:00 5.230 5.53
... ...
2020-12-31 23:30:00 3.342 3.81
2020-12-31 23:36:00 3.447 3.92
2020-12-31 23:42:00 3.549 4.03
2020-12-31 23:48:00 3.646 4.14
2020-12-31 23:54:00 3.739 4.24
Event.csv sample:
Verified
Date_Time
2010-01-06 12:05:00 5.161
2010-03-13 02:06:00 5.187
2010-07-24 06:13:00 5.208
Here is a way to make a fun interactive plot using Bokeh.
import pandas as pd
from bokeh.plotting import figure, show
from bokeh.models import HoverTool
df_dict = {'Date_Time': {0: '2010-01-01 00:00:00',
1: '2010-01-01 00:06:00',
2: '2010-01-01 00:12:00',
3: '2010-01-01 00:18:00',
4: '2010-01-01 00:24:00',
5: '2020-12-31 23:30:00',
6: '2020-12-31 23:36:00',
7: '2020-12-31 23:42:00',
8: '2020-12-31 23:48:00',
9: '2020-12-31 23:54:00'},
'Predicted': {0: 5.161,
1: 5.187,
2: 5.208,
3: 5.222,
4: 5.23,
5: 3.342,
6: 3.447,
7: 3.549,
8: 3.646,
9: 3.739},
'Verified': {0: 5.56,
1: 5.57,
2: 5.56,
3: 5.55,
4: 5.53,
5: 3.81,
6: 3.92,
7: 4.03,
8: 4.14,
9: 4.24}}
df = pd.DataFrame(df_dict)
# Bokeh likes this for the x_axis_type with time-series
df['Date_Time'] = pd.to_datetime(df['Date_Time'])
df.set_index('Date_Time', inplace=True)
# Bokeh likes string for tooltips
df['Date'] = df.index.astype(str)
# Bokeh 3.x uses width/height; older releases spelled these plot_width/plot_height
p = figure(width=1200,
           height=800,
           x_axis_type="datetime",
           y_range=(1, 10),
           title="sample plot")
col_list = ['Predicted', 'Verified'] #or df.columns
color_list =['red', 'blue']
for col, color in zip(col_list, color_list):
    source = df
    rend = p.line(x='Date_Time',
                  y=col,
                  source=source,
                  legend_label=col,
                  color=color,
                  line_width=1.5)
    p.add_tools(HoverTool(renderers=[rend],
                          tooltips=[("Value", "@{" + col + "}"),
                                    ("Date_Time", "@{Date}")],
                          mode='mouse'))
p.legend.click_policy="hide"
p.yaxis.axis_label='values'
p.xaxis.axis_label='Date'
show(p)
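The Bokeh example above draws one figure; the per-event automation asked about in the question could be sketched as a plain loop over the event timestamps. This is a rough sketch rather than a definitive implementation, and it uses matplotlib (as in the question) instead of Bokeh. The function name `plot_events`, the `out_dir` argument, the PNG file naming, and the synthetic data are all assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: figures are written to disk, not shown
import matplotlib.pyplot as plt
import pandas as pd

def plot_events(df, event_times, out_dir, buffer=pd.Timedelta(hours=12)):
    """Save one PNG per event covering the event time +/- buffer.

    Hypothetical helper; assumes df has a sorted DatetimeIndex.
    Returns the list of file paths that were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for event_time in event_times:
        chunk = df.loc[event_time - buffer : event_time + buffer]
        if chunk.empty:
            continue  # no data around this event, so skip it
        fig, ax = plt.subplots(figsize=(12, 6))
        chunk.plot(ax=ax)
        ax.set_title(f"Event {event_time}")
        path = os.path.join(out_dir, f"event_{event_time:%Y%m%d_%H%M}.png")
        fig.savefig(path)
        plt.close(fig)  # release the figure so hundreds of events do not pile up in memory
        paths.append(path)
    return paths

# Synthetic stand-ins for parsed.csv and event.csv (assumed shapes only).
idx = pd.date_range("2010-01-01", "2010-01-10", freq="6min", name="Date_Time")
df = pd.DataFrame({"Predicted": 5.0, "Verified": 5.5}, index=idx)
events = pd.DatetimeIndex(["2010-01-06 12:05:00", "2010-03-13 02:06:00"])

saved = plot_events(df, events, tempfile.mkdtemp())
```

With the frames from the question, the call would look something like `plot_events(df, df_event.index, "plots")`. Closing each figure inside the loop matters when there are hundreds of events, since matplotlib otherwise keeps every figure alive.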