Time series analysis in Python using conditions

I have the following data (sample)

    Symbol Sections      iBid     Bid                Date
0    O.U20       O1  99.73167  99.730 2020-06-29 16:32:25
1    O.Z20       O1  99.70250  99.700 2020-06-29 16:32:25
2    O.H21       O1       NaN  99.795 2020-06-29 16:32:25
3    O.M21       O1  99.81167  99.810 2020-06-29 16:32:25
4    O.U21       O2  99.81667  99.815 2020-06-29 16:32:25
5    O.Z21       O2       NaN  99.795 2020-06-29 16:32:25
6    O.H22       O2  99.81000  99.810 2020-06-29 16:32:25
7    O.M22       O2  99.79500  99.795 2020-06-29 16:32:25
16  F3.U26       F3       NaN   1.000 2020-06-29 16:32:25
17  F3.Z26       F3       NaN  -3.000 2020-06-29 16:32:25
18  F3.H27       F3       NaN  -1.000 2020-06-29 16:32:25
19  F6.H26       F6  -1.75000     NaN 2020-06-29 16:32:25
20  F6.M26       F6  -4.50000     NaN 2020-06-29 16:32:25
21  F6.U26       F6  -5.50000     NaN 2020-06-29 16:32:25
22  F9.U20       F9  -8.50000  -9.000 2020-06-29 16:32:25
23   O.U20       O3  99.73167  99.730 2020-06-29 16:32:26
24   O.Z20       O3  99.70250  99.700 2020-06-29 16:32:26
25   O.H21       O3       NaN  99.795 2020-06-29 16:32:26
26   O.M21       O3  99.81167  99.810 2020-06-29 16:32:26
27   O.U21       O4  99.81667  99.815 2020-06-29 16:32:26
28   O.Z21       O4       NaN  99.795 2020-06-29 16:32:26
29   O.H22       O4  99.81000  99.810 2020-06-29 16:32:26
30   O.M22       O4  99.79500  99.795 2020-06-29 16:32:26

I want to draw a scatter plot, line chart, or any other chart suited to showing a trend over time when a condition is met. For example, I want to see how many times iBid is higher than Bid over time, both for each symbol group (O, S, F) and for each section (O1, F3, etc.).

I know I'm expected to show some working, but I'm not sure whether such a chart is even possible. So far, all I can do is separate the data based on Symbol

df_O = df[df['Symbol'].str.contains('O')]

and filter out the results like

IbidgreaterBid = big_frame[(big_frame.iBid > big_frame.Bid)]

Is it possible to obtain a graph that shows when iBid > Bid, with the Date column as the x-axis? (The Date column has thousands of rows that differ only by seconds.)

It's not clear what you mean by a graph that can analyze when iBid > Bid. However, I can suggest a way to distinguish data points based on whether iBid is greater than Bid. In the following example, red scatter points mark rows where iBid > Bid and blue points mark all other rows. Because the timestamps differ only at the scale of seconds, I've used the mdates date formatter to make the x-ticks show HH:MM:SS only.

import math
from datetime import timedelta

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.offsetbox import AnchoredText

# On matplotlib older than 3.6 this style is named 'seaborn-whitegrid'.
plt.style.use('seaborn-v0_8-whitegrid')

# The sample data calls the timestamp column 'Date'; make sure it holds real timestamps.
df['Date'] = pd.to_datetime(df['Date'])

n_sections = df['Sections'].nunique()
cols = 2
rows = math.ceil(n_sections / cols)  # round up so an odd number of sections still fits

# squeeze=False keeps the axes array 2-D even when there is only one row.
fig, axes = plt.subplots(rows, cols, figsize=(16, 8), sharex=False, sharey=False, squeeze=False)
axes = axes.ravel()

for ax, (name, section) in zip(axes, df.groupby('Sections')):
    # A comparison with NaN on either side is False, so those rows stay blue.
    above = section['iBid'] > section['Bid']
    ax.scatter(section['Date'], section['iBid'], color='blue')
    ax.scatter(section.loc[above, 'Date'], section.loc[above, 'iBid'], color='red')
    ax.set_xlim([section['Date'].min() - timedelta(seconds=5),
                 section['Date'].max() + timedelta(seconds=5)])
    ax.set_xlabel('Date Time', fontsize=20)
    ax.set_ylabel('iBid', fontsize=20)
    # Label each panel with its section name in the lower-right corner.
    ax.add_artist(AnchoredText(str(name), loc=4, prop=dict(size=20)))
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M:%S'))
    ax.tick_params(axis='both', direction='in', which='major', length=5, width=2, labelsize=16)

# Hide the leftover panel when the number of sections is odd.
for ax in axes[n_sections:]:
    ax.set_visible(False)

plt.show()
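
The question also asks how many times iBid is higher than Bid over time, per section and per symbol group. Here is a minimal sketch of one way to count that, assuming the column names from the sample data (Symbol, Sections, iBid, Bid, Date). The helper name count_ibid_above_bid, the derived Root column, and the one-second bucket size are illustrative choices, not part of the original post.

```python
import numpy as np
import pandas as pd


def count_ibid_above_bid(df, key="Sections", freq="1s"):
    """Count rows where iBid > Bid per `key` group and per time bucket."""
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    # Comparisons involving NaN are False, so rows with a missing
    # iBid or Bid are never counted.
    out["above"] = out["iBid"] > out["Bid"]
    return (
        out.groupby([key, pd.Grouper(key="Date", freq=freq)])["above"]
        .sum()
        .unstack(key, fill_value=0)  # rows: time buckets, columns: groups
    )


# A few rows copied from the sample data in the question.
sample = pd.DataFrame(
    {
        "Symbol": ["O.U20", "O.H21", "O.U21", "O.H22", "F9.U20", "O.U20"],
        "Sections": ["O1", "O1", "O2", "O2", "F9", "O3"],
        "iBid": [99.73167, np.nan, 99.81667, 99.81000, -8.5, 99.73167],
        "Bid": [99.730, 99.795, 99.815, 99.810, -9.0, 99.730],
        "Date": ["2020-06-29 16:32:25"] * 5 + ["2020-06-29 16:32:26"],
    }
)

# Assumed convention: the symbol group is the part before the dot.
sample["Root"] = sample["Symbol"].str.split(".").str[0]

by_section = count_ibid_above_bid(sample, key="Sections")
by_root = count_ibid_above_bid(sample, key="Root")
print(by_section)
print(by_root)
```

Each result has one row per time bucket and one column per group, so something like by_section.cumsum().plot(drawstyle="steps-post") should give a running count over time. A larger freq (for example "1min") may read better when there are thousands of rows.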
