Add Months to Data Frame using a period column

I'm looking to add a %Y%m%d date column to my dataframe using a period column that holds the integers 1-32. These represent monthly data points starting at a defined environment variable "odate" (e.g. if odate=20190531, then period 1 should be 20190531, period 2 should be 20190630, etc.).

I tried defining a dictionary with the periods in the column as the keys and odate + MonthEnd(period - 1) as the values.
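The question does not show the dictionary itself, so the following is only a guess at what that approach might look like. The hard-coded `n_periods = 32` and the name `period_to_date` are assumptions made for illustration; the sample data is taken from the example dataset in the question.

```python
import pandas as pd

odate = pd.Timestamp("2019-05-31")
df = pd.DataFrame({"period": [1, 2, 4, 3, 5, 11],
                   "value": [5.5, 5, 6.2, 5, 40, 5]})

# Assumed reconstruction: one dictionary entry per period, 1..32.
n_periods = 32  # hard-coded, which is the inflexible part
period_to_date = {p: odate + pd.offsets.MonthEnd(p - 1)
                  for p in range(1, n_periods + 1)}

# Look up each row's period in the dictionary.
df["date"] = df["period"].map(period_to_date)
print(df)
```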

This works fine; however, I want to make the code flexible enough to handle changes in the number of periods.

Is there a function that will let me fill the date column with odate for period 1 and then the subsequent month ends for subsequent periods?

Example dataset:

odate=20190531

period value
1      5.5
2      5
4      6.2
3      5
5      40
11     5

Desired dataset:

odate=20190531

period value date
1      5.5   2019-05-31
2      5     2019-06-30
4      6.2   2019-08-31
3      5     2019-07-31
5      40    2019-09-30
11     5     2020-03-31

You can use pd.date_range():

pd.date_range(start = '2019-05-31', periods = 100,freq='M')

You can change the total number of periods depending on what you need. freq='M' means a month-end frequency (pandas 2.2 and later spell this alias 'ME').

The pandas documentation has a list of offset aliases you can use for the freq parameter.
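To connect date_range to the period column, one option is to build the month-end index once and then look up each period by position. This is a minimal sketch, assuming the periods are 1-based integers and that odate is itself a month end; `month_ends` is an illustrative name. The frequency is passed as an offset object rather than the 'M' alias so the same line works regardless of which alias spelling the installed pandas expects.

```python
import pandas as pd

odate = pd.Timestamp("2019-05-31")
df = pd.DataFrame({"period": [1, 2, 4, 3, 5, 11],
                   "value": [5.5, 5, 6.2, 5, 40, 5]})

# Generate exactly as many month ends as the largest period requires.
month_ends = pd.date_range(start=odate,
                           periods=int(df["period"].max()),
                           freq=pd.offsets.MonthEnd())

# Period 1 is position 0, period 2 is position 1, and so on.
df["date"] = month_ends[df["period"].to_numpy() - 1]
print(df)
```

Because the length of the index comes from `df["period"].max()`, nothing needs to change when the number of periods does.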

If you just want to add or subtract some number of periods from a date, you can use pd.DateOffset:

odate = pd.Timestamp('20191031')
odate
>> Timestamp('2019-10-31 00:00:00')

odate - pd.DateOffset(months=4)
>> Timestamp('2019-06-30 00:00:00')

odate + pd.DateOffset(months=4)
>> Timestamp('2020-02-29 00:00:00')
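Note that DateOffset(months=n) shifts by calendar months and keeps the day of the month where it can, so it only lands on month ends when the starting day is the 31st (or gets clipped). A small sketch of the difference from MonthEnd, using a start date chosen for illustration:

```python
import pandas as pd

start = pd.Timestamp("2019-06-30")  # a month end that is not the 31st

# DateOffset keeps the day of month: 30 June -> 30 July.
plus_calendar_month = start + pd.DateOffset(months=1)

# MonthEnd steps to the next month end: 30 June -> 31 July.
plus_month_end = start + pd.offsets.MonthEnd(1)

print(plus_calendar_month.date())  # 2019-07-30
print(plus_month_end.date())       # 2019-07-31
```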

To turn the period column into month-end dates counted from odate:

odate = pd.Timestamp('20190531')
df['date'] = df.period.apply(lambda x: odate + pd.offsets.MonthEnd(x-1))
df
 period value   date
0   1   5.5     2019-05-31
1   2   5.0     2019-06-30
2   4   6.2     2019-08-31
3   3   5.0     2019-07-31
4   5   40.0    2019-09-30
5   11  5.0     2020-03-31
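One caveat worth checking: this works as shown because odate is already a month end. If odate fell mid-month, MonthEnd(0) and MonthEnd(1) would both roll forward to the same month end, so periods 1 and 2 would collide. The sketch below demonstrates this with a hypothetical mid-month odate and one possible workaround, which is to snap the start date to its month end first. Whether that matches the intended meaning of period 1 is an assumption.

```python
import pandas as pd

odate = pd.Timestamp("2019-05-15")  # hypothetical mid-month start

# Both offsets roll forward to the end of May.
period_1 = odate + pd.offsets.MonthEnd(0)
period_2 = odate + pd.offsets.MonthEnd(1)
print(period_1.date(), period_2.date())  # 2019-05-31 2019-05-31

# Possible workaround: snap to the month end once, then add offsets.
base = odate + pd.offsets.MonthEnd(0)
dates = [base + pd.offsets.MonthEnd(p - 1) for p in [1, 2, 3]]
print([d.date().isoformat() for d in dates])
# ['2019-05-31', '2019-06-30', '2019-07-31']
```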

To improve performance, use a list comprehension instead of apply:

df['date'] = [odate + pd.offsets.MonthEnd(period-1) for period in df.period]
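Since the question describes odate as a value like 20190531 and asks for a %Y%m%d column, the pieces can be wrapped in a small helper. This is a sketch only: the function name `add_period_dates` and its parameters are made up for illustration, and it assumes odate is a month end given as an integer or string in %Y%m%d form.

```python
import pandas as pd

def add_period_dates(df, odate, period_col="period", fmt="%Y%m%d"):
    """Return a copy of df with a 'date' column of formatted month ends."""
    # Parse e.g. 20190531 or "20190531" into a Timestamp.
    start = pd.to_datetime(str(odate), format="%Y%m%d")
    # Period 1 maps to the start date, period n to n-1 month ends later.
    dates = [start + pd.offsets.MonthEnd(int(p) - 1) for p in df[period_col]]
    out = df.copy()
    out["date"] = pd.DatetimeIndex(dates).strftime(fmt)
    return out

df = pd.DataFrame({"period": [1, 2, 4, 3, 5, 11],
                   "value": [5.5, 5, 6.2, 5, 40, 5]})
result = add_period_dates(df, 20190531)
print(result)
```

The helper makes no assumption about how many periods there are, so it keeps working if the range grows beyond 32.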
