How to simulate pandas dataframe data with incremental datetimes

Question

In Python 3 and pandas:

Assuming I have a dataframe:

datetime,id,value
2020-03-12,1,100
2020-03-13,1,105
2020-03-14,1,110
2020-03-12,2,100
2020-03-13,2,105
2020-03-14,2,110

I am trying to simulate these datasets with x extra historical days.

Let us say x=2 for now, and we won't add any new IDs, just the existing IDs in the dataset. The value column can be incremental or random. How could I do it?

The first thing we have to do is extend the time:

df2=pd.DataFrame(pd.date_range(pd.to_datetime('today'), periods=10, freq='1440min'))

df['datetime']=df['datetime'].append(df2)

Then I got:

ValueError: cannot reindex from a duplicate axis

How could I do it?

Answer 1

One way is to set_index the datetime and id columns, then reindex against every (id, date) pair you want, built by passing the unique ids and a date_range to pd.MultiIndex.from_product, and finally reset_index to put them back as columns, like:

import pandas as pd

# make sure the datetime column has datetime dtype
df['datetime'] = pd.to_datetime(df['datetime'])

# number of extra days to add for each id
x = 2
df_re = df.set_index(['id', 'datetime'])\
          .reindex(pd.MultiIndex.from_product([df['id'].unique(), 
                                               pd.date_range(df['datetime'].min(), 
                                                             df['datetime'].max() + pd.Timedelta(days=x))], 
                                              names=['id', 'datetime']),
                   fill_value=120)\
          .reset_index()

print(df_re)
   id   datetime  value
0   1 2020-03-12    100
1   1 2020-03-13    105
2   1 2020-03-14    110
3   1 2020-03-15    120
4   1 2020-03-16    120
5   2 2020-03-12    100
6   2 2020-03-13    105
7   2 2020-03-14    110
8   2 2020-03-15    120
9   2 2020-03-16    120
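
The fill_value=120 above gives every added row the same constant. If the values should keep increasing instead, one possible variation is sketched below. It rebuilds the sample data from the question so it runs on its own, and it assumes a fixed step of 5 per day (a hypothetical parameter, chosen only because the sample values grow by 5) and that each id has no gaps inside its existing date range.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "datetime": pd.to_datetime(["2020-03-12", "2020-03-13", "2020-03-14"] * 2),
    "id": [1, 1, 1, 2, 2, 2],
    "value": [100, 105, 110, 100, 105, 110],
})

x = 2      # number of extra days per id
step = 5   # assumed daily increment (hypothetical, matches the sample data)

# Every (id, date) pair from the first date to the last date plus x days
full_index = pd.MultiIndex.from_product(
    [df["id"].unique(),
     pd.date_range(df["datetime"].min(),
                   df["datetime"].max() + pd.Timedelta(days=x))],
    names=["id", "datetime"],
)

# Reindex without fill_value, so the added rows hold NaN
df_inc = df.set_index(["id", "datetime"]).reindex(full_index).reset_index()

# Within each id, number the added rows 1, 2, ... (0 for existing rows,
# assuming the existing rows have no gaps)
missing = df_inc["value"].isna()
n_missing = missing.astype(int).groupby(df_inc["id"]).cumsum()

# Carry the last known value forward and add one step per added day.
# astype(int) is safe here because every id has a value on the first date.
df_inc["value"] = (
    df_inc.groupby("id")["value"].ffill() + n_missing * step
).astype(int)

print(df_inc)
```

With the sample data, each id continues from 110 to 115 and 120 on the two added days.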
