
Generating expanded time series data from existing series in Python

I have searched for a solution to this for a long time but can't find one, even though I believe it should be pretty easy to do. I have a time series at 1-hour intervals covering one year. What I want to do is create fake data for the following years by tinkering with my original data just a bit. For example, if my original data looks like this:

Date standard   Estimated production 

1/1/2016 7:00   0,0  
1/1/2016 8:00   0,0  
1/1/2016 9:00   16,3  
1/1/2016 10:00  29,4   
1/1/2016 11:00  40,6  
1/1/2016 12:00  33,9

(it continues like that until the end of the year), I would like to create fake data that is similar for each respective date:

Date standard   Estimated production 

1/1/2017 7:00   0,01  
1/1/2017 8:00   0,03  
1/1/2017 9:00   16,1 
1/1/2017 10:00  29,3  
1/1/2017 11:00  40,8  
1/1/2017 12:00  33,1

The changes above are of course totally random; the production should be increased or decreased by a number within a set limit. Thank you in advance!

You could use DateOffset to shift the index by one year (and then you can modify the values as you want).

To generate noise, you could look at the numpy random utilities.

import numpy as np
import pandas as pd

# df is assumed to have a DatetimeIndex and a 'production' column.
# Same values as 2016, but with the dates shifted by 1 year (2017)
fake_data = df.loc['2016'].copy()
fake_data.index = fake_data.index + pd.DateOffset(years=1)

# Add Gaussian noise with the same standard deviation as the production values
noise = np.random.randn(len(fake_data)) * fake_data['production'].std()
fake_data['production'] = fake_data['production'] + noise

# Append the fake year to the original data
new_data = pd.concat([df, fake_data], axis=0)
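
Since the question asks for changes "within a set limit", a bounded variant of the snippet above might be closer to what is wanted. The following is only a sketch under a few assumptions that are not in the original post: the column is called 'production', the limit is a hypothetical max_change of 0.5, negative production is not meaningful (so values are clipped at zero), and the hourly frame built at the top is random stand-in data. It also drops 29 February before shifting, because 2016 is a leap year and DateOffset(years=1) would map that day onto 28 February 2017, leaving duplicate timestamps.

```python
import numpy as np
import pandas as pd

# Stand-in for the asker's data (assumption): hourly values for all of 2016.
index = pd.date_range("2016-01-01", "2016-12-31 23:00", freq=pd.Timedelta(hours=1))
df = pd.DataFrame(
    {"production": np.random.default_rng(0).uniform(0.0, 50.0, len(index))},
    index=index,
)


def make_fake_year(source, max_change=0.5, seed=None):
    """Shift `source` forward one year and perturb 'production' by at most max_change.

    `max_change` and `seed` are illustrative parameters, not part of the original answer.
    """
    rng = np.random.default_rng(seed)

    # Drop 29 February: shifted by one year it would collide with 28 February.
    is_leap_day = (source.index.month == 2) & (source.index.day == 29)
    fake = source.loc[~is_leap_day].copy()
    fake.index = fake.index + pd.DateOffset(years=1)

    # Uniform noise keeps every change inside [-max_change, +max_change].
    noise = rng.uniform(-max_change, max_change, len(fake))

    # Clip at zero on the assumption that production cannot be negative.
    fake["production"] = (fake["production"] + noise).clip(lower=0.0)
    return fake


fake_2017 = make_fake_year(df, max_change=0.5, seed=1)
combined = pd.concat([df, fake_2017])
```

Uniform noise is used here instead of Gaussian noise because a Gaussian draw has no hard bound; scaling it by the standard deviation of the whole series, as in the snippet above, can also produce changes much larger than the ones shown in the question. To generate further years, the same function could be called again on the previous fake year.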
