Time series data that needs to be sampled to every 15 minutes and plotted

I have a data frame that looks like this:

                 counts month
login_time      
1970-03-14 17:45:52 3   Mar
1970-01-09 01:31:25 3   Jan
1970-04-12 04:03:15 3   Apr
1970-02-24 23:09:57 3   Feb
1970-04-04 01:17:40 3   Apr
1970-02-12 11:16:53 3   Feb
1970-03-17 01:01:39 3   Mar
1970-01-06 21:45:52 3   Jan
1970-03-29 03:24:57 3   Mar
1970-04-03 14:42:38 2   Apr

I would like to aggregate these login counts into 15-minute intervals and then plot the results.

I tried the following:

df.groupby('login_time').resample('15min').count()

but the way it resamples doesn't seem correct:

        counts  month
login_time  login_time      
1970-01-01 20:12:16 1970-01-01 20:00:00 1   1
1970-01-01 20:13:18 1970-01-01 20:00:00 1   1
1970-01-01 20:16:10 1970-01-01 20:15:00 1   1
1970-01-01 20:16:36 1970-01-01 20:15:00 1   1
1970-01-01 20:16:37 1970-01-01 20:15:00 1   1
1970-01-01 20:21:41 1970-01-01 20:15:00 1   1
1970-01-01 20:26:05 1970-01-01 20:15:00 1   1
1970-01-01 20:26:21 1970-01-01 20:15:00 1   1
1970-01-01 20:31:03 1970-01-01 20:30:00 1   1
1970-01-01 20:34:46 1970-01-01 20:30:00 1   1

Thank you!

I'm not sure this is exactly what you meant, since you didn't specify whether you want 15-minute bins aligned to midnight or to the start of the dataset, but here's something that I think would work:

I generated random dates in some range (to have something to bin) using that answer.

import pandas as pd
import numpy as np

# Make some fake data
def random_date_generator(start_date, range_in_seconds):
    # start_date has second resolution, so the random offset is in seconds
    # (the original added a bare integer, which NumPy also treats as seconds)
    offset = np.timedelta64(int(np.random.randint(range_in_seconds)), 's')
    return np.datetime64(start_date) + offset

data_length = 1000
date_col = [random_date_generator('1970-01-01 00:00:00', 100000) for _ in range(data_length)]
count_col = np.random.randint(5, size=data_length)

# Sample:
df = pd.DataFrame({'login_time': date_col, 'counts': count_col})
df = df.set_index('login_time')

# Resample the DatetimeIndex directly (no groupby) and count rows per 15-minute bin.
# '15min' is the current spelling of the older '15T' alias.
df.resample('15min').count()
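
The resample call above counts rows per bin. It does not cover the bin alignment question or the plot. The following is a minimal sketch of those pieces, not a definitive solution. It assumes that `login_time` is already a `DatetimeIndex`, that the `counts` column holds a per-row login count that may need summing rather than row-counting, and that pandas is version 1.1 or later (for the `origin` argument). The six sample rows and the helper name `plot_bins` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: six rows shaped like the question's data frame
df = pd.DataFrame(
    {"counts": [3, 1, 2, 2, 1, 3]},
    index=pd.to_datetime([
        "1970-01-01 20:12:16",
        "1970-01-01 20:13:18",
        "1970-01-01 20:16:10",
        "1970-01-01 20:26:05",
        "1970-01-01 20:31:03",
        "1970-01-01 21:02:00",
    ]),
)
df.index.name = "login_time"

# Number of rows per 15-minute bin; by default bins are aligned to midnight
rows_per_bin = df["counts"].resample("15min").count()

# Total of the 'counts' column per bin, if each row already carries a count
logins_per_bin = df["counts"].resample("15min").sum()

# Bins anchored at the first timestamp in the data instead of midnight
from_first = df["counts"].resample("15min", origin="start").sum()


def plot_bins(series):
    """Draw one value per bin as a step line (hypothetical helper).

    pandas delegates plotting to matplotlib, which must be installed.
    """
    ax = series.plot(drawstyle="steps-post")
    ax.set_xlabel("login_time (15-minute bins)")
    ax.set_ylabel("logins")
    return ax
```

Calling `plot_bins(logins_per_bin)` and then `matplotlib.pyplot.show()` should display the aggregated series. Whether `count()` or `sum()` is the right aggregation depends on what the `counts` column means in the real data.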
