How to change year and month in a date which is in datetime format?
I have a pandas DataFrame (8000 rows) with a datetime column that holds dates in YYYY-MM-DD form. I want to change it from a single repeated date to multiple months and years, keeping the same day.
My output:
0 data-1 2011-12-03
1 data-2 2011-12-03
2 data-3 2011-12-03
..
..
data-4 2011-12-03
data-5 2011-12-03
7999 data-6 2011-12-03
Expected output:
val1 date
0 data-1 2009-01-03
.. 2009-02-03
2009-03-03
..
..
2009-11-03
2009-12-03
11 data-n 2010-01-03
.. 2010-02-03
2010-03-03
..
..
2010-11-03
2010-12-03
2011-01-03
.. 2011-02-03
2011-03-03
..
..
2011-11-03
7999 data-m 2011-12-03
I want the dates to spread over the 12 months and 5 years. I tried:
df.date[0:999] = pd.to_datetime(stallion_df.date[0:999]) + pd.offsets.DateOffset(years=1)
df.date[0:999] = pd.to_datetime(stallion_df.date[0:999]) + pd.offsets.DateOffset(months=3)
...
and so on, slice by slice, for every year and month across the 8000 rows, which is clearly not optimal. Any help would be much appreciated. Thanks
You can use date_range to create the series of dates you need:
dates = pd.Series(pd.date_range(start='1/3/2009', end='3/11/2011', freq='M'))
Replace the start and end dates with your own. With freq='M' the range advances one month at a time, rolling over into the next year automatically, and each entry is the last day of its month. If you want to set the day to a specific one:
dates.apply(lambda x: x.replace(day=3))
This returns the same series, but with the day set to 3 for every entry. If you also want a longer series in which each date is repeated, you can use repeat:
dates.repeat(10).reset_index(drop=True)
This way you get the same series, but with each date repeated 10 times.
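Putting these steps together, here is a minimal sketch of filling an 8000-row frame with monthly dates on day 3. The frame `df`, its column names, the 2009 start, and the choice to trim the repeated dates to the frame length are all assumptions made for illustration, not part of the original question. It uses the month-start alias `'MS'` plus a two-day shift instead of `freq='M'` followed by `replace`, partly because, as far as I know, recent pandas versions deprecate `'M'` in favor of `'ME'`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's 8000-row frame (names are assumed).
df = pd.DataFrame({
    "val1": [f"data-{i}" for i in range(1, 8001)],
    "date": pd.Timestamp("2011-12-03"),
})

# 60 month starts (5 years x 12 months), shifted by two days to the 3rd.
months = pd.date_range("2009-01-01", periods=60, freq="MS") + pd.Timedelta(days=2)

# Repeat each month often enough to cover every row, then trim the excess.
per_month = int(np.ceil(len(df) / len(months)))
df["date"] = months.repeat(per_month)[: len(df)]
```

Because 8000 is not a multiple of 60, the last month ends up with fewer rows than the others; whether that is acceptable depends on how the rows should map to dates.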
Is this your expected output? It constructs the Cartesian product of the dates ranging from '2018-01-03' to '2022-12-03' and the val column. You have date-1 to date-m, so I substituted 100 for m, which gives 6000 rows (100 values × 60 months).
m = 100
out = (pd.MultiIndex.from_product([[f'date-{i}' for i in range(1,m+1)],
pd.date_range('2018-01-01','2022-12-01', freq='MS') + pd.DateOffset(days=2)])
.to_frame(name=['val','date']).reset_index(drop=True))
Output:
val date
0 date-1 2018-01-03
1 date-1 2018-02-03
2 date-1 2018-03-03
3 date-1 2018-04-03
4 date-1 2018-05-03
... ... ...
5995 date-100 2022-08-03
5996 date-100 2022-09-03
5997 date-100 2022-10-03
5998 date-100 2022-11-03
5999 date-100 2022-12-03
[6000 rows x 2 columns]
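If the values already live in a DataFrame rather than being generated, the same Cartesian product can be sketched with a cross merge. This is an alternative to the MultiIndex approach above, not the answerer's method; the frames `vals` and `dates` and the three sample values are assumptions for illustration, and `how="cross"` needs pandas 1.2 or later.

```python
import pandas as pd

# Hypothetical small value column; the real frame would hold date-1 .. date-m.
vals = pd.DataFrame({"val": [f"date-{i}" for i in range(1, 4)]})

# Month starts from 2018-01 to 2022-12, shifted by two days to the 3rd.
dates = pd.DataFrame({
    "date": pd.date_range("2018-01-01", "2022-12-01", freq="MS") + pd.Timedelta(days=2)
})

# Cross merge pairs every value with every date (3 x 60 rows here).
out = vals.merge(dates, how="cross")
```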