Python: how to count the days per month in the 30 days after a date

Question

I have a dataframe containing dates, and I would like to process the data as follows for feature engineering.

df

date
2016/1/1
2015/2/10
2016/3/25

After processing, I would like the df to look like this:

date       Jan  Feb  Mar  Apr
2016/1/1    30    0    0    0   // dates from 1/1 to 1/30: the number of days in Jan
2015/2/10    0   19   11    0   // dates from 2/10 to 3/11: the number of days in Feb and in Mar
2016/3/25    0    0    7   23   // dates from 3/25 to 4/23: the number of days in Mar and in Apr

Get the 30 days after df["date"]:
df["date"] + timedelta(days=30)
Count how many of those 30 days fall in each month.

Is there a quick way to do this?

Thanks.

Answer 1

Just go step by step. First, offset your original date by + pd.to_timedelta('30d'). Then create a column holding only the month, using df.date.dt.month. Then create a column with the end-of-month date for each date; some ideas for that are here: Want the last day of each month for a data frame in pandas. Finally, fill in a matrix whose columns are the 12 months, setting the values in the columns for the month and month+1.

By enriching your DataFrame one column at a time, you can easily move from your input to your desired output. There is unlikely to be a magic method that does everything in a single call.
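The steps above can be sketched as follows. This is only one possible reading of the answer, not the answerer's own code: the names `month_end`, `first`, `second` and `out` are made up for illustration, and the window is assumed to include the start date (so it ends 29 days later, which matches the counts in the question). One caveat: a window starting on 31 January of a non-leap year touches three months, which this two-column sketch does not handle.

```python
import numpy as np
import pandas as pd

names = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

df = pd.DataFrame({"date": pd.to_datetime(["2016/1/1", "2015/2/10", "2016/3/25"])})

# Last day of the month in which each 30-day window starts.
month_end = df["date"] + pd.offsets.MonthEnd(0)

# Days of the window that fall in the starting month (at most 30),
# and the remainder, which is assumed to fall in the following month.
first = ((month_end - df["date"]).dt.days + 1).clip(upper=30).to_numpy()
second = 30 - first

# Fill a rows x 12 matrix: starting month and the month after it.
start_month = df["date"].dt.month.to_numpy()   # 1..12
rows = np.arange(len(df))
arr = np.zeros((len(df), 12), dtype=int)
arr[rows, start_month - 1] += first
arr[rows, start_month % 12] += second           # December wraps to January

out = pd.DataFrame(arr, columns=names, index=df.index)
result = df.join(out)
print(result[["date", "Jan", "Feb", "Mar", "Apr"]])
```

Unlike the question's table, this keeps all 12 month columns, so every row has the same shape regardless of which months occur in the data.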

Read all about the date/time functions in pandas here: https://pandas.pydata.org/pandas-docs/stable/timeseries.html - there are a lot!

Answer 2

You can use a custom function with date_range and groupby with size:

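import pandas as pd

df['date'] = pd.to_datetime(df['date'])
date = df[['date']]
names = ['Jan', 'Feb', 'Mar', 'Apr', 'May']

def f(x):
    # the 30 consecutive days starting at this row's date
    a = pd.date_range(x['date'], periods=30)
    # number of those days that fall in each calendar month
    return pd.Series(a).groupby(a.month).size()


df = df.apply(f, axis=1).fillna(0).astype(int)
# month numbers start at 1, so map 1 -> 'Jan', 2 -> 'Feb', ...
df = df.rename(columns={k: v for k, v in enumerate(names, start=1)})
df = date.join(df)
print(df)
        date  Jan  Feb  Mar  Apr
0 2016-01-01   30    0    0    0
1 2015-02-10    0   19   11    0
2 2016-03-25    0    0    7   23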
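To see what the custom function does for a single row, here is a minimal sketch using one date from the question (2015-02-10). The variable names `days` and `counts` are illustrative only.

```python
import pandas as pd

# The 30 consecutive days starting on 2015-02-10 (ends on 2015-03-11).
days = pd.date_range("2015-02-10", periods=30)

# Group the days by their month number and count each group.
counts = pd.Series(days).groupby(days.month).size()
print(counts.to_dict())  # {2: 19, 3: 11}
```

The result is indexed by month number (2 for February, 3 for March). When apply collects one such Series per row, months that do not occur in a row come out as NaN, which is why the solution calls fillna(0) before converting to int.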

A similar solution with value_counts:

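date = df[['date']]
names = ['Jan', 'Feb', 'Mar', 'Apr', 'May']

# parentheses allow the method chain to continue across lines
df = (df.apply(lambda x: pd.date_range(x['date'], periods=30).month.value_counts(), axis=1)
        .fillna(0)
        .astype(int))
# month numbers start at 1, so map 1 -> 'Jan', 2 -> 'Feb', ...
df = df.rename(columns={k: v for k, v in enumerate(names, start=1)})
df = date.join(df)
print(df)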

Another solution:

names = ['Jan', 'Feb','Mar','Apr','May']
date = df[['date']]

df["date1"] = df["date"] + pd.Timedelta(days=29)
df = df.reset_index().melt(id_vars='index', value_name='date').set_index('date')
df = df.groupby('index').resample('D').asfreq()
df = (df.groupby([df.index.get_level_values(0), df.index.get_level_values(1).month])
        .size()
        .unstack(fill_value=0))
df = df.rename(columns = {k+1:v for k,v in enumerate(names)})
df = date.join(df)
print (df)
        date  Jan  Feb  Mar  Apr
0 2016-01-01   30    0    0    0
1 2015-02-10    0   19   11    0
2 2016-03-25    0    0    7   23
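Since the question asks for a quick method, the same counts can also be computed without a row-wise apply. The following is a sketch of a different technique from the ones above (NumPy repeat/tile plus pd.crosstab); it is not part of the original answer, and names such as `n_days`, `row` and `counts` are hypothetical.

```python
import numpy as np
import pandas as pd

names = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

df = pd.DataFrame({"date": pd.to_datetime(["2016/1/1", "2015/2/10", "2016/3/25"])})

n_days = 30
# Each row label and start date repeated once per day of its window.
row = np.repeat(df.index.to_numpy(), n_days)
start = np.repeat(df["date"].to_numpy(), n_days)
# Day offsets 0..29, tiled once per original row.
offset = np.tile(np.arange(n_days), len(df)).astype("timedelta64[D]")

# Month number of every day in every window.
months = pd.DatetimeIndex(start + offset).month.to_numpy()

# Count days per (original row, month) pair.
counts = pd.crosstab(row, months)
counts.columns = [names[m - 1] for m in counts.columns]
counts.index.name = None

result = df.join(counts)
print(result)
```

This builds len(df) * 30 intermediate values, so memory grows with the window length; for a 30-day window that is usually a reasonable trade for avoiding a Python-level call per row.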

2 answers

Answer 1 (2), 2017-06-06 11:43:20

Answer 2 (1, accepted), 2017-06-06 12:34:40
