Generate random timeseries data with dates
I am trying to generate random data (integers) with dates so that I can practice pandas data analysis commands on it and plot time series graphs.
temp depth acceleration
2019-01-1 -0.218062 -1.215978 -1.674843
2019-02-1 -0.465085 -0.188715 0.241956
2019-03-1 -1.464794 -1.354594 0.635196
2019-04-1 0.103813 0.194349 -0.450041
2019-05-1 0.437921 0.073829 1.346550
Is there a random DataFrame generator that can produce something like this, with consecutive dates one month apart?
You can either use pandas.util.testing (note that this module has been deprecated since pandas 1.0 and may be missing from newer releases):
import pandas.util.testing as testing
import numpy as np
np.random.seed(1)
testing.N, testing.K = 5, 3 # Setting the rows and columns of the desired data
print(testing.makeTimeDataFrame(freq='MS'))
>>>
A B C
2000-01-01 -0.488392 0.429949 -0.723245
2000-02-01 1.247192 -0.513568 -0.512677
2000-03-01 0.293828 0.284909 1.190453
2000-04-01 -0.326079 -1.274735 -0.008266
2000-05-01 -0.001980 0.745803 1.519243
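If pandas.util.testing is not available in your installation, a similar frame can be built by hand. The following is only a minimal sketch: make_time_dataframe is a hypothetical helper name (not a pandas function), and the choice of standard-normal values and letter column names is an assumption made to resemble the output above.

```python
import string

import numpy as np
import pandas as pd


def make_time_dataframe(rows=5, cols=3, start="2000-01-01", freq="MS", seed=None):
    """Hypothetical helper: standard-normal values on a regular date index."""
    if cols > 26:
        raise ValueError("this sketch only names up to 26 columns (A-Z)")
    rng = np.random.default_rng(seed)  # seeded generator for reproducible values
    index = pd.date_range(start, periods=rows, freq=freq)
    columns = list(string.ascii_uppercase[:cols])  # 'A', 'B', 'C', ...
    return pd.DataFrame(rng.standard_normal((rows, cols)), index=index, columns=columns)


frame = make_time_dataframe(rows=5, cols=3, freq="MS", seed=1)
print(frame)
```

The values will differ from the output shown above, since a different random generator is used.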
Or, if you need more control over the random values being generated, you can use something like:
import numpy as np
import pandas as pd
np.random.seed(1)
rows, cols = 5, 3
data = np.random.rand(rows, cols)  # You can use other random functions to generate values with constraints
# freq='MS' sets a monthly frequency anchored at the first day of each month;
# other aliases exist, e.g. 'T' (or 'min') for minutes
tidx = pd.date_range('2019-01-01', periods=rows, freq='MS')
data_frame = pd.DataFrame(data, columns=['a', 'b', 'c'], index=tidx)
print(data_frame)
>>>
a b c
2019-01-01 0.992856 0.217750 0.538663
2019-02-01 0.189226 0.847022 0.156730
2019-03-01 0.572417 0.722094 0.868219
2019-04-01 0.023791 0.653147 0.857148
2019-05-01 0.729236 0.076817 0.743955
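As an illustration of "other random functions with constraints", here is a rough sketch that draws each column from a different distribution. The column names come from the question; the means, standard deviations, and bounds are made-up values chosen only for demonstration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # seeded generator, so reruns give the same values
rows = 5
tidx = pd.date_range("2019-01-01", periods=rows, freq="MS")

constrained = pd.DataFrame(
    {
        # Normally distributed around 15 with standard deviation 5 (illustrative)
        "temp": rng.normal(loc=15.0, scale=5.0, size=rows),
        # Uniformly distributed between 0 and 100 (illustrative bounds)
        "depth": rng.uniform(low=0.0, high=100.0, size=rows),
        # Standard normal, so values can be negative or positive
        "acceleration": rng.standard_normal(rows),
    },
    index=tidx,
)
print(constrained)
```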
Use the numpy.random.rand or numpy.random.randint functions with the DataFrame constructor:
import numpy as np
import pandas as pd

np.random.seed(2019)
N = 10
rng = pd.date_range('2019-01-01', freq='MS', periods=N)
df = pd.DataFrame(np.random.rand(N, 3), columns=['temp', 'depth', 'acceleration'], index=rng)
print(df)
temp depth acceleration
2019-01-01 0.903482 0.393081 0.623970
2019-02-01 0.637877 0.880499 0.299172
2019-03-01 0.702198 0.903206 0.881382
2019-04-01 0.405750 0.452447 0.267070
2019-05-01 0.162865 0.889215 0.148476
2019-06-01 0.984723 0.032361 0.515351
2019-07-01 0.201129 0.886011 0.513620
2019-08-01 0.578302 0.299283 0.837197
2019-09-01 0.526650 0.104844 0.278129
2019-10-01 0.046595 0.509076 0.472426
If you need integers:
np.random.seed(2019)
N = 10
rng = pd.date_range('2019-01-01', freq='MS', periods=N)
# randint(20, ...) draws integers from 0 to 19 inclusive
df = pd.DataFrame(np.random.randint(20, size=(N, 3)),
                  columns=['temp', 'depth', 'acceleration'],
                  index=rng)
print(df)
temp depth acceleration
2019-01-01 8 18 5
2019-02-01 15 12 10
2019-03-01 16 16 7
2019-04-01 5 19 12
2019-05-01 16 18 5
2019-06-01 16 15 1
2019-07-01 14 12 10
2019-08-01 0 11 18
2019-09-01 15 19 1
2019-10-01 3 16 18
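If each column should have its own integer range, one possible approach is to pass per-column bounds to the generator. This is a sketch only: the bounds below are invented for illustration, and it uses NumPy's newer Generator API rather than the np.random.randint call shown above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2019)
N = 10
dates = pd.date_range("2019-01-01", freq="MS", periods=N)

# Illustrative bounds, one entry per column: low is inclusive, high is exclusive
low = [-10, 0, 1]
high = [40, 500, 10]

# The per-column bounds broadcast across the N rows
values = rng.integers(low=low, high=high, size=(N, 3))
int_df = pd.DataFrame(values, columns=["temp", "depth", "acceleration"], index=dates)
print(int_df)
```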