Select records by date, where the date is the 15th or the previous business day of the month
I am trying to select the records that correspond to each 15th of the month. Here is an example of the data:
date open date1 date2
0 2000-01-03 00:00:00 0 2000-01-17 00:00:00 2000-02-15 00:00:00
1 2000-01-07 00:00:00 6084704 2000-01-17 00:00:00 2000-02-15 00:00:00
3 2000-01-12 00:00:00 992482 2000-01-17 00:00:00 2000-02-15 00:00:00
4 2000-01-17 00:00:00 11721104 2000-02-15 00:00:00 2000-03-15 00:00:00
5 2000-01-18 00:00:00 4459200 2000-02-15 00:00:00 2000-03-15 00:00:00
6 2000-01-19 00:00:00 3652363 2000-02-15 00:00:00 2000-03-15 00:00:00
7 2000-01-27 00:00:00 17421705 2000-02-15 00:00:00 2000-03-15 00:00:00
...
43 2000-03-24 00:00:00 13594539 2000-04-17 00:00:00 2000-05-15 00:00:00
44 2000-03-27 00:00:00 5122526 2000-04-17 00:00:00 2000-05-15 00:00:00
45 2000-03-29 00:00:00 3041827 2000-04-17 00:00:00 2000-05-15 00:00:00
46 2000-03-30 00:00:00 4769960 2000-04-17 00:00:00 2000-05-15 00:00:00
48 2000-03-31 00:00:00 15982410 2000-04-17 00:00:00 2000-05-15 00:00:00
49 2000-04-04 00:00:00 3178232 2000-04-17 00:00:00 2000-05-15 00:00:00
51 2000-04-05 00:00:00 15571975 2000-04-17 00:00:00 2000-05-15 00:00:00
52 2000-04-06 00:00:00 2694165 2000-04-17 00:00:00 2000-05-15 00:00:00
53 2000-04-07 00:00:00 536426 2000-04-17 00:00:00 2000-05-15 00:00:00
55 2000-04-10 00:00:00 5523700 2000-04-17 00:00:00 2000-05-15 00:00:00
56 2000-04-11 00:00:00 11212425 2000-04-17 00:00:00 2000-05-15 00:00:00
58 2000-04-12 00:00:00 546223 2000-04-17 00:00:00 2000-05-15 00:00:00
61 2000-04-13 00:00:00 17913960 2000-05-15 00:00:00 2000-06-15 00:00:00
63 2000-04-17 00:00:00 3824185 2000-05-15 00:00:00 2000-06-15 00:00:00
...
1578 2006-01-03 00:00:00 4453005 2006-01-16 00:00:00 2006-02-15 00:00:00
1579 2006-01-04 00:00:00 6557373 2006-01-16 00:00:00 2006-02-15 00:00:00
1580 2006-01-05 00:00:00 2837987 2006-01-16 00:00:00 2006-02-15 00:00:00
...
1634 2006-03-14 00:00:00 31819879 2006-04-17 00:00:00 2006-05-15 00:00:00
1635 2006-03-20 00:00:00 1541321 2006-04-17 00:00:00 2006-05-15 00:00:00
1636 2006-03-21 00:00:00 47047150 2006-04-17 00:00:00 2006-05-15 00:00:00
1638 2006-03-22 00:00:00 6111712 2006-04-17 00:00:00 2006-05-15 00:00:00
Sometimes there is no 15th, since my dates only cover business days. In that case I'd like to choose the record closest to the 15th of the month. So the expected outcome would be something like:
3 2000-01-12 00:00:00 992482 2000-01-17 00:00:00 2000-02-15 00:00:00
...
61 2000-04-13 00:00:00 17913960 2000-05-15 00:00:00 2000-06-15 00:00:00
...
1634 2006-03-14 00:00:00 31819879 2006-04-17 00:00:00 2006-05-15 00:00:00
I figured out how to find every 15th of the month, which is not very difficult:
mid_date_mask = df['date'].map(lambda x: x.day) == 15
df[mid_date_mask]
But I cannot work out how to step back to an earlier record when there is no 15th in a month.
I would be very grateful for your help!
You could try whether the following works for you:
mid_date_idx = (
(df["date"].dt.day - 15).abs()
.groupby([df["date"].dt.year, df["date"].dt.month])
.idxmin()
.to_list()
)
mid_date_mask = df.index.isin(mid_date_idx)
Here df is your dataframe. Build a series from the column date with the absolute difference of the day and 15, and group it by the year and month of date. Use .idxmin() to get the first index with the minimum per group, and collect them in a list mid_date_idx. Then build mid_date_mask by checking if the index is in the list.
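Note that the absolute difference picks the record nearest to the 15th, which may fall after it: in the question's data, 2000-01-17 is closer to the 15th than 2000-01-12. If the record has to be on or before the 15th, as the title and the expected outcome suggest, one possible variation is to drop the later days first and take the largest remaining day per month. A minimal sketch, assuming a small sample built by hand from the question's rows (the names early, prev_idx and nearest_idx are only illustrative):

```python
import pandas as pd

# Hand-built sample from a few of the question's rows (date and open only).
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            [
                "2000-01-07", "2000-01-12", "2000-01-17", "2000-01-18",
                "2000-04-12", "2000-04-13", "2000-04-17",
                "2006-03-14", "2006-03-20",
            ]
        ),
        "open": [
            6084704, 992482, 11721104, 4459200,
            546223, 17913960, 3824185,
            31819879, 1541321,
        ],
    },
    index=[1, 3, 4, 5, 58, 61, 63, 1634, 1635],
)

# Nearest-to-15th selection, as in the answer above (can land after the 15th).
nearest_idx = (
    (df["date"].dt.day - 15).abs()
    .groupby([df["date"].dt.year, df["date"].dt.month])
    .idxmin()
    .to_list()
)

# Variation: keep only rows on or before the 15th, then take the row with
# the largest day within each (year, month) group.
early = df[df["date"].dt.day <= 15]
prev_idx = (
    early["date"].dt.day
    .groupby([early["date"].dt.year, early["date"].dt.month])
    .idxmax()
    .to_list()
)

print(nearest_idx)       # [4, 61, 1634]
print(prev_idx)          # [3, 61, 1634]
print(df.loc[prev_idx])  # rows dated 2000-01-12, 2000-04-13, 2006-03-14
```

On this sample the variation returns the three rows listed in the expected outcome. A month with no record on or before the 15th would simply produce no row under this sketch.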