Get the last date of each month in a pandas time series

Question

Currently I'm generating a DatetimeIndex using the function zipline.utils.tradingcalendar.get_trading_days. The time series is roughly daily but has some gaps.

My goal is to get the last date in the DatetimeIndex for each month.

.to_period('M') and .to_timestamp('M') don't work, since they give the last calendar day of the month rather than the last date actually present in the index for each month.

For example, given the time series below, I would want to select '2015-05-29', even though the last day of the month is '2015-05-31'.

['2015-05-18', '2015-05-19', '2015-05-20', '2015-05-21', '2015-05-22', '2015-05-26', '2015-05-27', '2015-05-28', '2015-05-29', '2015-06-01']

Answer 1

Condla's answer came closest to what I needed. However, since my time index stretched over more than a year, I needed to group by both year and month and then select the maximum date. Below is the code I ended up with.

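import pandas as pd

# tempTradeDays is the initial DatetimeIndex
dateRange = []
# Group by year first, then by month within each year
dictYears = tempTradeDays.groupby(tempTradeDays.year)
for yr in dictYears.keys():
    tempYear = pd.DatetimeIndex(dictYears[yr]).groupby(pd.DatetimeIndex(dictYears[yr]).month)
    for m in tempYear.keys():
        # The maximum date in each (year, month) group is the last available date
        dateRange.append(max(tempYear[m]))
# .order() has been removed from pandas; sort_values() replaces it
dateRange = pd.DatetimeIndex(dateRange).sort_values()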
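
For comparison, the same year-and-month grouping can be sketched without explicit loops. This is only an illustration and not the answerer's code: the sample dates and the names `trade_days` and `last_per_month` are made up here, and the dates are assumed to be unique.

```python
import pandas as pd

# Hypothetical sample spanning two years, with gaps (not the asker's real data)
trade_days = pd.DatetimeIndex([
    "2014-05-29", "2014-05-30", "2014-06-02",
    "2015-05-28", "2015-05-29", "2015-06-01",
])

# A monthly period such as 2015-05 encodes year and month together,
# so May 2014 and May 2015 end up in different groups.
as_series = trade_days.to_series()
last_per_month = pd.DatetimeIndex(
    as_series.groupby(trade_days.to_period("M")).max()
)
print(last_per_month)
```

Grouping on the monthly period avoids the problem described above, where grouping on the month number alone would merge the same month from different years.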

Answer 2

My strategy would be to group by month and then select the "maximum" of each group.

If "dt" is your DatetimeIndex object:

last_dates_of_the_month = []
dt_month_group_dict = dt.groupby(dt.month)
for month in dt_month_group_dict:
    last_date = max(dt_month_group_dict[month])
    last_dates_of_the_month.append(last_date)

The list "last_dates_of_the_month" contains the last date that occurs in your dataset for each month. You can use this list to create a DatetimeIndex in pandas again (or do whatever else you want with it).

Answer 3

This is an old question, but none of the existing answers is perfect. This is the solution I came up with (assuming the dates form a sorted index). It could even be written in one line, but I split it up for readability:

month1 = pd.Series(apple.index.month)
month2 = pd.Series(apple.index.month).shift(-1)
mask = (month1 != month2)
apple[mask.values].head(10)

A few notes:

Shifting a datetime series requires another pd.Series instance.
Boolean mask indexing requires .values.

By the way, when the dates are business days, it would be easier to use resampling: apple.resample('BM')
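
The shift-and-compare mask from this answer can be tried on a small example. This is a minimal sketch under assumptions: `apple` here is a made-up Series with a sorted DatetimeIndex standing in for the answerer's data, and `month_last` is a name introduced only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the answerer's `apple` data, sorted by date
idx = pd.DatetimeIndex([
    "2015-05-27", "2015-05-28", "2015-05-29", "2015-06-01", "2015-06-02",
])
apple = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], index=idx)

# Month number of each row, and of the row that follows it
month = pd.Series(apple.index.month)
next_month = month.shift(-1)

# A row is the last of its month when the next row is in a different month.
# The final row compares against NaN, so it is always kept.
mask = month != next_month
month_last = apple[mask.values]
print(month_last)
```

Because only the month number is compared, this relies on the index being sorted and on consecutive rows never jumping exactly one or more whole years.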

Answer 4

Suppose your data frame looks like this:

[Image: original dataframe]

Then the following code will give you the last available day of each month.

df_monthly = df.reset_index().groupby([df.index.year,df.index.month],as_index=False).last().set_index('index')

[Image: transformed dataframe]

This one-line code does its job :)
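
Since the screenshots are not available, here is a small runnable sketch of the same idea of grouping by year and month and keeping the last row. It is a variant, not the answerer's exact line: it uses `tail(1)` so the original DatetimeIndex is kept without `reset_index`, and the frame `df` with its `price` column is invented for illustration. It assumes the index is sorted.

```python
import pandas as pd

# Hypothetical frame with a sorted DatetimeIndex spanning two years
idx = pd.DatetimeIndex([
    "2014-05-29", "2014-05-30", "2015-05-28", "2015-05-29", "2015-06-01",
])
df = pd.DataFrame({"price": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# Group rows by (year, month) and keep the last row of every group.
# tail(1) returns the selected rows with their original index labels.
df_monthly = df.groupby([df.index.year, df.index.month]).tail(1)
print(df_monthly)
```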

Answer 5

Maybe an answer is not needed anymore, but while searching for a solution to the same problem I found what may be a simpler one:

import pandas as pd 

sample_dates = pd.date_range(start='2010-01-01', periods=100, freq='B')
month_end_dates = sample_dates[sample_dates.is_month_end]
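
One caveat worth checking against your own data: as far as I can tell, in current pandas `is_month_end` tests for the calendar month end, so a month whose last calendar day is missing from the index yields no match. The sketch below uses dates taken from the question; the names `dates`, `calendar_month_ends` and `last_available` are introduced here only for illustration.

```python
import pandas as pd

# Dates from the question: 2015-05-31 falls on a weekend and is not present
dates = pd.DatetimeIndex(["2015-05-27", "2015-05-28", "2015-05-29", "2015-06-01"])

# Keeps only dates that are the calendar end of their month
calendar_month_ends = dates[dates.is_month_end]
print(len(calendar_month_ends))

# For contrast: the last date actually present in each month
last_available = dates.to_series().groupby(dates.to_period("M")).max()
print(last_available)
```

On this sample the first selection is empty, while the grouped maximum returns 2015-05-29 for May, which is what the question asks for.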

Answer 6

Try this: create a new diff column in which a value of 1 marks the change from one month to the next.

df['diff'] = np.where(df['Date'].dt.month.diff() != 0, 1, 0)
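
A small sketch of how this flag behaves, under the assumption that `df` has a sorted `Date` column of datetimes. The sample data and the `is_last` column are made up for illustration. Note that `diff()` compares each row with the previous one, so it flags the first row of a month; comparing with the next row via `diff(-1)` flags the last row instead.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a sorted datetime column named 'Date'
df = pd.DataFrame({
    "Date": pd.to_datetime(["2015-05-28", "2015-05-29", "2015-06-01", "2015-06-02"])
})
month = df["Date"].dt.month

# Current month minus previous month: non-zero (or NaN) on the first row of a month
df["diff"] = np.where(month.diff() != 0, 1, 0)

# Current month minus next month: non-zero (or NaN) on the last row of a month
df["is_last"] = np.where(month.diff(-1) != 0, 1, 0)

last_dates = df.loc[df["is_last"] == 1, "Date"]
print(df)
```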

6 answers

Answer 1
6 Accepted 2015-06-10 12:15:02

Answer 2
3 2015-06-09 23:05:20

Answer 3
3 2018-02-21 18:17:26

Answer 4
3 2019-05-24 20:56:54

Answer 5
2 2015-08-21 08:04:25

Answer 6
0 2020-08-06 15:21:48
