How can I automate the plotting of multiple 'chunks' of data from a very large time-series using Pandas?

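My goal is to generate a time-series plot for each event in 'event.csv' from a large time-series dataset called 'parsed.csv'.

I can successfully plot a single event by manually defining its time range with a +/- 12 hour buffer. There are hundreds of events, so some form of automation is necessary. I am very new to loops and automation, and I am quite stuck.

Code: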

import matplotlib.pyplot as plt
import pandas as pd

df_event = pd.read_csv('event.csv', parse_dates=['Date_Time'], index_col=['Date_Time'])
df = pd.read_csv('parsed.csv', parse_dates=['Date_Time'], index_col=['Date_Time'])

df['Verified'] = pd.to_numeric(df['Verified'], errors='coerce')    #coerces non-numeric entries to NaN (float64 dtype)
df.dropna(axis='index',how='any',inplace=True)                     #drops rows containing any null value



df = df.loc['2018-05-01':'2018-05-06']                            #can manually define event using this


fig, axs = plt.subplots(figsize=(12, 6))                          #define axis, and plots
df.plot(ax=axs)



Sample of my large time-series csv dataset:

                        Predicted  Verified
Date_Time                               
2010-01-01 00:00:00      5.161      5.56
2010-01-01 00:06:00      5.187      5.57
2010-01-01 00:12:00      5.208      5.56
2010-01-01 00:18:00      5.222      5.55
2010-01-01 00:24:00      5.230      5.53
                       ...       ...
2020-12-31 23:30:00      3.342      3.81
2020-12-31 23:36:00      3.447      3.92
2020-12-31 23:42:00      3.549      4.03
2020-12-31 23:48:00      3.646      4.14
2020-12-31 23:54:00      3.739      4.24



Event.csv sample:

                        Verified
Date_Time                               
2010-01-06 12:05:00      5.161      
2010-03-13 02:06:00      5.187      
2010-07-24 06:13:00      5.208      





Here is a way to make a nice interactive plot using Bokeh.

import pandas as pd

# only the names actually used below are imported
from bokeh.plotting import figure, show
from bokeh.models import HoverTool


df_dict = {'Date_Time': {0: '2010-01-01 00:00:00',
  1: '2010-01-01 00:06:00',
  2: '2010-01-01 00:12:00',
  3: '2010-01-01 00:18:00',
  4: '2010-01-01 00:24:00',
  5: '2020-12-31 23:30:00',
  6: '2020-12-31 23:36:00',
  7: '2020-12-31 23:42:00',
  8: '2020-12-31 23:48:00',
  9: '2020-12-31 23:54:00'},
 'Predicted': {0: 5.161,
  1: 5.187,
  2: 5.208,
  3: 5.222,
  4: 5.23,
  5: 3.342,
  6: 3.447,
  7: 3.549,
  8: 3.646,
  9: 3.739},
 'Verified': {0: 5.56,
  1: 5.57,
  2: 5.56,
  3: 5.55,
  4: 5.53,
  5: 3.81,
  6: 3.92,
  7: 4.03,
  8: 4.14,
  9: 4.24}}


df = pd.DataFrame(df_dict)

# Bokeh likes this for the x_axis_type with time-series
df['Date_Time'] = pd.to_datetime(df['Date_Time'])
df.set_index('Date_Time', inplace=True)

# Bokeh likes string for tooltips
df['Date'] = df.index.astype(str) 

# width/height replace plot_width/plot_height, which were removed in Bokeh 3.0
p = figure(width=1200,
           height=800,
           x_axis_type="datetime",
           y_range=(1, 10),
           title="sample plot")

    
col_list = ['Predicted', 'Verified'] #or df.columns
color_list =['red', 'blue']

for col, color in zip(col_list, color_list):
 
    source = df

    rend = p.line(x='Date_Time',
           y=col,
           source=source,
           legend_label=col,
           color=color,
           line_width=1.5)


    p.add_tools(HoverTool(renderers=[rend],
                          tooltips=[("Value", "@{" + col + "}"),
                          ("Date_Time",  "@{Date}")],
                          mode='mouse'))


p.legend.click_policy="hide"
p.yaxis.axis_label='values'
p.xaxis.axis_label='Date'


show(p)
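
The Bokeh example above draws a single frame and does not loop over the events. Below is a minimal sketch of the per-event automation the question asks for, using matplotlib as in the question's own code. It assumes the large frame has a DatetimeIndex, as produced by reading 'parsed.csv' with index_col=['Date_Time'], and that the event times come from the index of 'event.csv'. The function name plot_event_windows, its buffer and out_dir parameters, the PNG file naming, and the headless "Agg" backend are illustrative assumptions, not part of the original post.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, figures are saved rather than shown
import matplotlib.pyplot as plt
import pandas as pd


def plot_event_windows(df, event_times, buffer="12h", out_dir=None):
    """Draw one figure per event, covering the event time +/- buffer.

    df          : DataFrame with a DatetimeIndex (e.g. the parsed.csv frame)
    event_times : iterable of timestamps (e.g. df_event.index)
    buffer      : half-width of the window, anything pd.Timedelta accepts
    out_dir     : hypothetical existing folder; if given, each figure is saved there as PNG
    Returns a dict mapping each plotted event time to the number of rows in its window.
    """
    df = df.sort_index()            # label slicing by time needs a sorted index
    buffer = pd.Timedelta(buffer)
    plotted = {}

    for event_time in pd.DatetimeIndex(event_times):
        # .loc slicing on a DatetimeIndex is inclusive at both ends
        window = df.loc[event_time - buffer : event_time + buffer]
        if window.empty:
            continue                # no data around this event, skip it

        fig, ax = plt.subplots(figsize=(12, 6))
        for col in window.columns:
            ax.plot(window.index, window[col], label=col)
        ax.axvline(event_time, color="k", linestyle="--", label="event")
        ax.set_title(f"Event at {event_time}")
        ax.legend()

        if out_dir is not None:
            fig.savefig(f"{out_dir}/event_{event_time:%Y%m%d_%H%M}.png")
        plt.close(fig)              # free memory; there may be hundreds of events
        plotted[event_time] = len(window)

    return plotted
```

With the frames from the question, the call would look like plot_event_windows(df, df_event.index, out_dir='plots'), where 'plots' is a placeholder for an existing folder. Closing each figure inside the loop matters here because matplotlib otherwise keeps every open figure in memory.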
