
Anomaly detection in time series data using Python

I am trying to write Python code that detects anomalies in time series data. My input data looks something like this: [image: input data] Here, the regions marked in red are anomalies. I want to get the x-coordinates of the data points that are anomalous. So far I have tried a basic if condition (i.e. if rate < 100, the data point is anomalous) and various statistical techniques such as the mean, the standard deviation, and rolling averages with different window sizes. However, none of them has worked well. Is there a way to achieve what I want using statistical methods? If there is no simple way to do this, I understand that I will have to look at machine learning algorithms. In that case, which algorithm would be suitable for my dataset? Thank you.

It looks as if your data comes in lumps. If you are able to distinguish between the lumps (for example, by a certain delay between two samples), you can look at the distribution of the samples within each lump. If you know that your rate will never drop below 100, I would start with that to clean the data up a bit, then look at the remaining distribution. The mode should help identify the "middle", most frequently occurring value. Cutting off everything more than a certain number of standard deviations away might work to get clean data, but there is no guarantee that you won't cut off some of the data you need.
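A minimal sketch of those steps in Python with NumPy, under some assumptions: the timestamps and rates are available as plain arrays, a lump boundary is any gap between consecutive timestamps larger than a chosen `max_gap`, and the floor of 100 and the cutoff `k` are tunable guesses. The function names `split_into_lumps` and `flag_outliers` are made up for illustration.

```python
import numpy as np

def split_into_lumps(t, max_gap):
    """Return a list of index arrays, one per lump.

    A new lump starts wherever the gap between two consecutive
    timestamps exceeds max_gap (an assumed, tunable value).
    """
    t = np.asarray(t, dtype=float)
    if t.size == 0:
        return []
    # Positions where a new lump begins.
    breaks = np.where(np.diff(t) > max_gap)[0] + 1
    return np.split(np.arange(t.size), breaks)

def flag_outliers(rate, floor=100.0, k=3.0):
    """Return a boolean mask that is True for suspected anomalies.

    Step 1: anything below `floor` is flagged outright.
    Step 2: of the rest, anything more than k standard deviations
            from the mean of the remaining samples is flagged too.
    """
    rate = np.asarray(rate, dtype=float)
    anomalous = rate < floor
    kept = rate[~anomalous]
    if kept.size > 1:
        mu, sigma = kept.mean(), kept.std()
        if sigma > 0:
            anomalous |= np.abs(rate - mu) > k * sigma
    return anomalous
```

To get the x-coordinates of the flagged points inside one lump, something like `t[idx][flag_outliers(rate[idx])]` should do, where `idx` is one of the index arrays returned by `split_into_lumps` and `t` and `rate` are NumPy arrays. As noted above, the standard deviation cut can also remove valid samples, so the value of `k` needs checking against your own data.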

Edit: you'd have to bin your data before getting the mode.
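A rough sketch of what binning before taking the mode could look like, assuming the rates are in a NumPy-compatible array and that a fixed `bin_width` suits the data. The names `binned_mode` and `far_from_mode`, and the `tolerance` parameter, are hypothetical and only there for illustration.

```python
import numpy as np

def binned_mode(values, bin_width):
    """Return the centre of the most populated fixed-width bin."""
    values = np.asarray(values, dtype=float)
    lo = values.min()
    # Assign every sample to a bin, counting from the minimum value.
    idx = np.floor((values - lo) / bin_width).astype(int)
    counts = np.bincount(idx)
    best = counts.argmax()
    return lo + (best + 0.5) * bin_width

def far_from_mode(values, bin_width, tolerance):
    """Boolean mask: True where a sample lies more than `tolerance`
    away from the binned mode."""
    values = np.asarray(values, dtype=float)
    return np.abs(values - binned_mode(values, bin_width)) > tolerance
```

The bin width matters: too narrow and every bin holds one sample, too wide and the mode stops telling you where the "middle" is. The x-coordinates of the flagged points can then be read off with the mask, e.g. `t[far_from_mode(rate, 10, 50)]` for a timestamp array `t`.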
