Maximum difference between two time series of different resolution
I have two time series that give electricity demand at one-hour resolution and at five-minute resolution. I am trying to find the maximum difference between these two time series. The one-hour data has 8760 rows (hourly for a year) and the 5-minute data has 104,722 rows (every 5 minutes for a year).
The only method I can think of is to expand the hourly data to 5-minute resolution by repeating each hourly value 12 times, and then take the maximum of the difference between the two data sets.
If this technique is the way to go, is there an easy way to convert my hourly data to 5-minute resolution by repeating each hourly value 12 times?
For reference, I have posted a plot of this data for one day.
PS: I am using Python for this task.
You can convert your hourly data to 5-minute resolution with NumPy's repeat function:
import numpy as np
np.repeat(hourly_data, 12)
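As a rough sketch of how the repeated series could then be compared with the 5-minute series: the toy arrays and the name five_min_data below are assumptions for illustration only. Note that positional subtraction only works when the 5-minute series has exactly 12 values per hour (8760 × 12 = 105,120 rows for a full year), so a series with missing rows, such as one with 104,722 rows, would first have to be aligned on timestamps.

```python
import numpy as np

# Hypothetical toy data: 3 hours of hourly demand and 36 five-minute readings.
hourly_data = np.array([100.0, 110.0, 105.0])
five_min_data = np.concatenate([
    np.full(12, 101.0),
    np.full(12, 107.0),
    np.full(12, 105.5),
])

# Repeat each hourly value 12 times so both arrays have the same length.
hourly_expanded = np.repeat(hourly_data, 12)
if hourly_expanded.shape != five_min_data.shape:
    raise ValueError("Series lengths differ; align on timestamps first")

# Largest absolute gap between the two series, and the position where it occurs.
abs_diff = np.abs(five_min_data - hourly_expanded)
max_diff = abs_diff.max()
max_pos = int(abs_diff.argmax())
print(max_diff, max_pos)
```

The explicit length check is there because np.repeat silently produces 12 values per hour regardless of how many 5-minute rows actually exist.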
I would strongly recommend against converting the hourly data into five-minute data. If the data in both cases refers to the mean load over those time ranges, you will be comparing more accurate data if you group the five-minute intervals into hourly values. Expanding the hourly data the way you describe gives finer granularity, but that granularity is not based on actual measurements, so you are not getting more value from it. If you aggregate the five-minute chunks into hourly chunks and compare the series that way, you can be more confident in the trustworthiness of your results.
To group them together and get that result, you can define a function like the following and use the apply method like so:
from datetime import datetime as dt

def to_hour(date):
    # Truncate a datetime to the start of its hour
    date = date.strftime("%Y-%m-%d %H:00:00")
    date = dt.strptime(date, "%Y-%m-%d %H:%M:%S")
    return date

df['Aggregated_Datetime'] = df['Original_Datetime'].apply(to_hour)
df.groupby('Aggregated_Datetime').agg('Real-Time Lo
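The aggregation idea above could also be sketched with pandas' resample, which groups by timestamp and therefore tolerates missing 5-minute rows. This is only a minimal illustration: the series name load, the toy values, the dropped reading, and the choice of the mean as the aggregate are all assumptions, not taken from the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical 5-minute series covering 3 hours (36 readings).
idx_5min = pd.date_range("2020-01-01 00:00", periods=36, freq="5min")
five_min = pd.Series(np.repeat([101.0, 107.0, 105.5], 12), index=idx_5min, name="load")

# Simulate a gap in the data by dropping one reading.
five_min = five_min.drop(idx_5min[5])

# Hypothetical hourly series for the same 3 hours.
idx_hour = pd.date_range("2020-01-01 00:00", periods=3, freq="60min")
hourly = pd.Series([100.0, 110.0, 105.0], index=idx_hour, name="load")

# Average the 5-minute readings within each hour
# (assumes both series report mean load over their interval).
five_min_hourly = five_min.resample("60min").mean()

# Subtraction aligns on the timestamp index; then take the largest absolute gap.
abs_diff = (five_min_hourly - hourly).abs()
max_diff = abs_diff.max()
max_time = abs_diff.idxmax()
print(max_diff, max_time)
```

Because the comparison is aligned on timestamps rather than on row position, a missing 5-minute row only changes how many readings go into that hour's mean.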