Select the first/fifteenth day of each month plus the day before and after in Python

Question

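I want to select specific days in a table in order to calculate the average for each specific group. My table has about 9000 rows and looks like this: sample data

I want to select the first value of each month, the last value of the month, the second value of the month, every 15th, the day before the 15th, and the day after the 15th, with only one value each.

The goal is to calculate the average for each specific group.

The result should look like this: result

I am struggling with calculating the 15th/before/after values as well as the "after the first" value.

What I have tried so far: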

import pandas as pd
df = pd.read_csv(...)  # file path not given in the original post
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)

"Average first of month"
dffirst = df[~df.index.to_period('m').duplicated()]
monthly_first = dffirst['Value'].mean()

"Average last of month"
dflast = df.resample("M").max()
monthly_last = dflast['Value'].mean()

Thanks

Answer 1

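As I understand it, some dates may be missing, which makes this a bit more complicated.

What I would do is track the index of the first/last available date within each month and go from there, i.e. first index +1 to get the second, and first index +14 to get the 15th available date. Calculating the averages is then straightforward.

However, you have to make sure the shifted indices exist (e.g. no negative indices, and no indices beyond the length of the dataframe).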
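A minimal sketch of that bounds check, assuming the positions are held in a NumPy integer array. The helper name `shift_positions` is hypothetical and not part of pandas or NumPy:

```python
import numpy as np

def shift_positions(positions, offset, length):
    """Shift integer positions by `offset` and drop any outside [0, length)."""
    shifted = np.asarray(positions) + offset
    return shifted[(shifted >= 0) & (shifted < length)]

# Example: positions of two month starts in a 40-row frame
first = np.array([0, 31])
second = shift_positions(first, 1, 40)      # positions 1 and 32
fifteenth = shift_positions(first, 14, 40)  # position 14 only; 45 is out of range
```

Filtering both ends in one mask avoids having to remember which direction each shift can overflow.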

For the code below, I assume the dates are in the index column.

# Indices of the first available date in each month.
# resample("MS") yields the month-start timestamps; method="bfill" maps each
# one to the position of the next date that actually exists in the index.
# (Index.get_loc(..., method=...) was removed in pandas 2.0, so get_indexer is used.)
month_starts = df.resample("MS").mean().index
indices_first = df.index.get_indexer(month_starts, method="bfill")

# Indices of the last available date in each month.
# Here the method is "ffill" and the rule is "M" (month end; spelled "ME" in pandas >= 2.2).
month_ends = df.resample("M").mean().index
indices_last = df.index.get_indexer(month_ends, method="ffill")

# Indices of the 15th available date in each month
indices_15 = indices_first + 14
indices_15 = indices_15[indices_15 < len(df)]

# Indices of the date before the last one
indices_before_last = indices_last - 1
indices_before_last = indices_before_last[indices_before_last >= 0]

You can then access the corresponding rows of the dataframe:

avg_first = df.iloc[indices_first]['Value'].mean()
avg_15th = df.iloc[indices_15]['Value'].mean()
avg_before_last = df.iloc[indices_before_last]['Value'].mean()
avg_last = df.iloc[indices_last]['Value'].mean()
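
To see the approach end to end, here is a small self-contained sketch. The data is synthetic (business days only, so weekends play the role of missing dates), and the `Value` column simply holds each row's position; both are assumptions made for illustration, not the asker's data. Month ends are derived with `pd.offsets.MonthEnd(0)` instead of a resample rule so the sketch does not depend on the "M"/"ME" alias.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): business days in Q1 2021, Value = row position
dates = pd.bdate_range("2021-01-01", "2021-03-31")
df = pd.DataFrame({"Value": np.arange(len(dates), dtype=float)}, index=dates)
df.index.name = "Date"

# Calendar month starts and the matching month ends
month_starts = df.resample("MS").mean().index
month_ends = month_starts + pd.offsets.MonthEnd(0)

# Positions of the first / last date that actually exists in each month
indices_first = df.index.get_indexer(month_starts, method="bfill")
indices_last = df.index.get_indexer(month_ends, method="ffill")

# Shifted positions, keeping only those inside the frame
indices_second = indices_first + 1
indices_second = indices_second[indices_second < len(df)]
indices_15 = indices_first + 14
indices_15 = indices_15[indices_15 < len(df)]
indices_before_last = indices_last - 1
indices_before_last = indices_before_last[indices_before_last >= 0]

# Averages per group
avg_first = df.iloc[indices_first]["Value"].mean()
avg_second = df.iloc[indices_second]["Value"].mean()
avg_15th = df.iloc[indices_15]["Value"].mean()
avg_before_last = df.iloc[indices_before_last]["Value"].mean()
avg_last = df.iloc[indices_last]["Value"].mean()
```

Note that `indices_first + 14` selects the 15th available row of the month, which differs from the calendar 15th whenever dates are missing. In a month with fewer than 15 available dates, the shifted position would land in the following month, so a same-month check may be worth adding.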

Solution 1
0 2021-04-07 15:50:06