How to detect and filter peaks over time series data?

I have a pandas DataFrame of user logins like this:

    id     datetime_login 
    646  2017-03-15 15:30:25
    611  2017-04-14 11:38:30
    611  2017-05-15 08:49:01
    651  2017-03-15 15:30:25
    611  2017-03-15 15:30:25
    652  2017-03-08 14:03:56
    652  2017-03-08 14:03:56
    652  2017-03-15 15:30:25
    654  2017-03-15 15:30:25
    649  2017-03-15 15:30:25
    902  2017-09-09 15:00:00
    902  2017-02-13 16:39:53
    902  2017-11-15 12:00:00
    902  2017-11-15 12:00:00
    902  2017-09-09 15:00:00
    902  2017-05-15 08:48:47
    902  2017-11-15 12:00:00

I plot the number of logins per day like this:

from datetime import date
from matplotlib.pyplot import subplots

# Keep only the date part (YYYY-MM-DD) of each login timestamp
df.datetime_login = df.datetime_login.apply(lambda x: str(x)[:10])
df.datetime_login = df.datetime_login.apply(lambda x: date(int(x[:4]), int(x[5:7]), int(x[8:10])))

fig, ax = subplots()
# Count logins per day and plot the counts in date order
df.datetime_login.value_counts().sort_index().plot(figsize=(25,10), colormap='jet', fontsize=20)

  1. How can I detect the peaks in the plotted time series data?

  2. How can I filter the peaks in my time series data into an array?

I tried:

import peakutils
indices = peakutils.indexes(df, thres=0.4, min_dist=1000)
print(indices) 

However, I got:

TypeError: unsupported operand type(s) for -: 'datetime.date' and 'int'

For reference, df.datetime_login.value_counts().sort_index().plot(figsize=(25,10), colormap='jet',fontsize=20) produces this plot:

[image: line plot of the login counts]

Try the following. You need to pass the Series returned by value_counts to peakutils.indexes, not your original df:

df_counts = df.datetime_login.value_counts().sort_index()
# peakutils.indexes returns integer positions, so select them with .iloc
df_counts.iloc[peakutils.indexes(df_counts.to_numpy(), thres=0.4, min_dist=1000)]

Output:

2017-03-15 15:30:25    6
Name: datetime_login, dtype: int64
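
If peakutils is not available, a similar result can be sketched with scipy.signal.find_peaks. Note that this swaps in a different library from the one used above, so treat it as an illustration rather than an equivalent. The sample rows are copied from the question; the names daily, min_height, peak_positions and peaks are made up for this example, and the 0.4 factor is only an assumption chosen to echo thres=0.4. Days without logins are filled with 0 so that the series is evenly spaced.

```python
import pandas as pd
from scipy.signal import find_peaks

# Login timestamps copied from the question (the ids are not needed here).
df = pd.DataFrame({
    "datetime_login": [
        "2017-03-15 15:30:25", "2017-04-14 11:38:30", "2017-05-15 08:49:01",
        "2017-03-15 15:30:25", "2017-03-15 15:30:25", "2017-03-08 14:03:56",
        "2017-03-08 14:03:56", "2017-03-15 15:30:25", "2017-03-15 15:30:25",
        "2017-03-15 15:30:25", "2017-09-09 15:00:00", "2017-02-13 16:39:53",
        "2017-11-15 12:00:00", "2017-11-15 12:00:00", "2017-09-09 15:00:00",
        "2017-05-15 08:48:47", "2017-11-15 12:00:00",
    ]
})

# Parse the strings and count logins per calendar day.
logins = pd.to_datetime(df["datetime_login"])
daily = logins.dt.floor("D").value_counts().sort_index()

# Insert the missing days with a count of 0 (evenly spaced daily series).
daily = daily.asfreq("D", fill_value=0)

# Assumption: a peak must reach at least 40% of the highest daily count.
min_height = 0.4 * daily.max()

# find_peaks returns integer positions of local maxima plus a properties dict.
peak_positions, _ = find_peaks(daily.to_numpy(), height=min_height)

# Filter the series down to the peaks; .to_numpy() gives a plain array.
peaks = daily.iloc[peak_positions]
print(peaks)
print(peaks.to_numpy())
```

On this sample the only day kept is 2017-03-15 with 6 logins, which matches the output above. One caveat: find_peaks only reports local maxima that have a neighbour on both sides, so the last day in the data (2017-11-15, 3 logins) is not returned even though it is above the threshold.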
