
Label a point in a graph using matplotlib for a time series

I have a pandas dataframe with 3 columns. I plot col1 on the Y axis and the time_stamps series on the X axis. Whenever col2 is -1, I want to highlight that point on the graph as an anomaly. I tried to get the coordinates and highlight the point using ax.text, but I cannot get the correct coordinates because the X axis is a time series. In the example below I am trying to label the third row, since col2[2] == -1.

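import pandas as pd
import matplotlib.pyplot as plt
# df is the dataframe described above (time_stamps, col1, col2)
df = df[["time_stamps", "col1"]]
df.set_index("time_stamps", inplace=True)
ax = df.plot()
# keep only the last 6 characters of each tick label
ticklabels = [l.get_text() for l in ax.xaxis.get_ticklabels()]
new_labels = [tick[-6:] for tick in ticklabels]
ax.xaxis.set_ticklabels(new_labels)
# attempt to label the third row; the string x coordinate is what fails
x1 = "16965 days 17:52:03"
y1 = 0.7
ax.text(x1, y1, "anomaly", fontsize=15)
plt.show()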

The sample data looks like this:

time_stamps=["16965 days 17:52:00","16965 days 17:52:02",
"16965 days 17:52:03","16965 days 17:52:05",
"16965 days 17:52:06","16965 days 17:52:08",
"16965 days 17:52:09","16965 days 17:52:11",
"16965 days 17:52:12","16965 days 17:52:14"]
col1=[0.02,0.01,0.7,0.019,0.019,0.017,0.023,0.04,0.072,0.05]
col2=[1,1,-1,1,1,1,1,1,1,1]

I figured out that I can convert the timestamps to a plain number (fractional days, by dividing by np.timedelta64(1,'D')) and then label the points as anomalies. This is what I did:

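import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def changetotimedelta(row):
    # convert a timedelta string to a float number of days
    return pd.to_timedelta(row["time_stamps"]) / np.timedelta64(1, 'D')

def main():
    # inputFile is the path to the CSV file (not shown in the original post)
    df = pd.read_csv(inputFile)
    df["time"] = df.apply(changetotimedelta, axis=1)
    new_df = df[["time", "col1"]].set_index("time")
    ax = new_df.plot()
    # coordinates of the anomalous point (third row)
    x1 = pd.to_timedelta("16965 days 17:52:03") / np.timedelta64(1, 'D')
    y1 = 0.7
    # x2 was not defined in the original post; reusing x1 here is an assumption
    x2 = x1
    ax.annotate('anomaly', xy=(x1, y1), xytext=(x2, 1),
                arrowprops=dict(facecolor='red', shrink=0.01))
    plt.show()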
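
The approach above hard-codes the coordinates of one point. As a rough sketch of how it might be generalized, the snippet below converts the whole time_stamps column to fractional days in one step and annotates every row where col2 is -1. It uses the sample data from the question; the helper names to_days, annotate_anomalies and build_plot are made up for illustration, and the text offset and arrow settings are arbitrary choices, not part of the original answer.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def to_days(timestamps):
    """Convert a Series of timedelta strings to float days (hypothetical helper)."""
    return pd.to_timedelta(timestamps) / np.timedelta64(1, "D")


def annotate_anomalies(df, ax):
    """Annotate every row where col2 == -1 and return the annotations created."""
    anomalies = df[df["col2"] == -1]
    created = []
    for x, y in zip(anomalies["time"], anomalies["col1"]):
        note = ax.annotate(
            "anomaly",
            xy=(x, y),                  # the data point being labelled
            xytext=(10, 10),            # label offset, in points (arbitrary)
            textcoords="offset points",
            arrowprops=dict(facecolor="red", shrink=0.05),
        )
        created.append(note)
    return created


def build_plot():
    """Build the plot from the question's sample data."""
    df = pd.DataFrame({
        "time_stamps": [
            "16965 days 17:52:00", "16965 days 17:52:02",
            "16965 days 17:52:03", "16965 days 17:52:05",
            "16965 days 17:52:06", "16965 days 17:52:08",
            "16965 days 17:52:09", "16965 days 17:52:11",
            "16965 days 17:52:12", "16965 days 17:52:14",
        ],
        "col1": [0.02, 0.01, 0.7, 0.019, 0.019, 0.017, 0.023, 0.04, 0.072, 0.05],
        "col2": [1, 1, -1, 1, 1, 1, 1, 1, 1, 1],
    })
    df["time"] = to_days(df["time_stamps"])
    ax = df.plot(x="time", y="col1")
    notes = annotate_anomalies(df, ax)
    return df, ax, notes


if __name__ == "__main__":
    build_plot()
    plt.show()
```

Because the anomalies are selected with a boolean mask on col2, the same code should work unchanged if more than one row is flagged with -1.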


 