Time series data that needs to be resampled every 15 minutes and plotted

Question

I have a data frame that looks like this:

                 counts month
login_time      
1970-03-14 17:45:52 3   Mar
1970-01-09 01:31:25 3   Jan
1970-04-12 04:03:15 3   Apr
1970-02-24 23:09:57 3   Feb
1970-04-04 01:17:40 3   Apr
1970-02-12 11:16:53 3   Feb
1970-03-17 01:01:39 3   Mar
1970-01-06 21:45:52 3   Jan
1970-03-29 03:24:57 3   Mar
1970-04-03 14:42:38 2   Apr

I would like to aggregate these login counts into 15-minute intervals and then plot the results.

I tried the following:

df.groupby('login_time').resample('15min').count()

but the way it resamples doesn't seem correct:

        counts  month
login_time  login_time      
1970-01-01 20:12:16 1970-01-01 20:00:00 1   1
1970-01-01 20:13:18 1970-01-01 20:00:00 1   1
1970-01-01 20:16:10 1970-01-01 20:15:00 1   1
1970-01-01 20:16:36 1970-01-01 20:15:00 1   1
1970-01-01 20:16:37 1970-01-01 20:15:00 1   1
1970-01-01 20:21:41 1970-01-01 20:15:00 1   1
1970-01-01 20:26:05 1970-01-01 20:15:00 1   1
1970-01-01 20:26:21 1970-01-01 20:15:00 1   1
1970-01-01 20:31:03 1970-01-01 20:30:00 1   1
1970-01-01 20:34:46 1970-01-01 20:30:00 1   1

Thank you!

Answer 1

I'm not sure this is exactly what you meant, since you did not specify whether you want 15-minute bins counted from midnight or from the start of the dataset, but here is something that I think would work:

I generated random dates in some range (to have something to bin) using that answer.

import pandas as pd
import numpy as np

# Make some fake data
def random_date_generator(start_date, range_in_seconds):
    # start_date is parsed with second resolution, so the random offset is in seconds
    offset = int(np.random.randint(0, range_in_seconds))
    return np.datetime64(start_date) + np.timedelta64(offset, 's')

data_length = 1000
date_col = [random_date_generator('1970-01-01 00:00:00', 100000) for _ in range(data_length)]
count_col = np.random.randint(5, size=data_length)

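# Sample:
df = pd.DataFrame({'login_time': date_col, 'counts': count_col})
df = df.set_index('login_time')

# Number of rows in each 15-minute bin; use .sum() instead to total the 'counts' column.
# ('15min' replaces the older '15T' alias, which recent pandas versions deprecate.)
df.resample('15min').count()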
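
The question also asks for a plot and for totals of the login counts, which the snippet above does not cover. Here is a minimal sketch of one way to do both, assuming pandas 1.1 or later (for the `origin` argument of `resample`) and matplotlib installed for plotting. The helper names `bin_logins` and `plot_binned` and the sample values are made up for illustration; `origin="start_day"` bins from midnight, while `origin="start"` bins from the first timestamp in the data.

```python
import pandas as pd


def bin_logins(df, freq="15min", origin="start_day"):
    """Sum the 'counts' column into fixed-width time bins.

    origin="start_day" aligns bins to midnight of the first day;
    origin="start" aligns them to the first timestamp in the data.
    """
    return df["counts"].resample(freq, origin=origin).sum()


def plot_binned(binned):
    """Draw the binned totals as a line chart and return the Axes.

    pandas delegates the drawing to matplotlib, so matplotlib must be installed.
    """
    ax = binned.plot()
    ax.set_xlabel("login_time")
    ax.set_ylabel("logins per bin")
    return ax


# Small hand-made sample (hypothetical values, same layout as the question)
sample = pd.DataFrame(
    {"counts": [3, 2, 1, 4]},
    index=pd.to_datetime(
        [
            "1970-01-01 20:12:16",
            "1970-01-01 20:13:18",
            "1970-01-01 20:16:10",
            "1970-01-01 20:31:03",
        ]
    ),
)
sample.index.name = "login_time"

# Bins starting at 20:00, 20:15 and 20:30
from_midnight = bin_logins(sample)
# Bins starting at 20:12:16 and 20:27:16
from_first = bin_logins(sample, origin="start")
```

To see the chart, call `plot_binned(from_midnight)` and then `matplotlib.pyplot.show()`. Whether `.sum()` or `.count()` is the right aggregation depends on whether each row already holds a login total or represents a single login.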

Solution 1
1 2020-05-31 01:56:50
