简体   繁体   中英

select records by date, where date is 15th ot the previous business day of the month

I am trying to select the records that correspond to each 15th of the month, here is the examle of the data:

           date              open               date1          date2
0       2000-01-03 00:00:00  0               2000-01-17 00:00:00    2000-02-15 00:00:00
1       2000-01-07 00:00:00  6084704         2000-01-17 00:00:00    2000-02-15 00:00:00
3       2000-01-12 00:00:00  992482          2000-01-17 00:00:00    2000-02-15 00:00:00
4       2000-01-17 00:00:00  11721104        2000-02-15 00:00:00    2000-03-15 00:00:00
5       2000-01-18 00:00:00  4459200         2000-02-15 00:00:00    2000-03-15 00:00:00
6       2000-01-19 00:00:00 3652363          2000-02-15 00:00:00    2000-03-15 00:00:00
7       2000-01-27 00:00:00 17421705         2000-02-15 00:00:00    2000-03-15 00:00:00
...
43      2000-03-24 00:00:00 13594539         2000-04-17 00:00:00    2000-05-15 00:00:00
44      2000-03-27 00:00:00 5122526          2000-04-17 00:00:00    2000-05-15 00:00:00
45      2000-03-29 00:00:00 3041827          2000-04-17 00:00:00    2000-05-15 00:00:00
46      2000-03-30 00:00:00 4769960          2000-04-17 00:00:00    2000-05-15 00:00:00
48      2000-03-31 00:00:00 15982410         2000-04-17 00:00:00    2000-05-15 00:00:00
49      2000-04-04 00:00:00 3178232          2000-04-17 00:00:00    2000-05-15 00:00:00
51      2000-04-05 00:00:00 15571975         2000-04-17 00:00:00    2000-05-15 00:00:00
52      2000-04-06 00:00:00 2694165          2000-04-17 00:00:00    2000-05-15 00:00:00
53      2000-04-07 00:00:00 536426           2000-04-17 00:00:00    2000-05-15 00:00:00
55      2000-04-10 00:00:00 5523700          2000-04-17 00:00:00    2000-05-15 00:00:00
56      2000-04-11 00:00:00 11212425         2000-04-17 00:00:00    2000-05-15 00:00:00
58      2000-04-12 00:00:00 546223           2000-04-17 00:00:00    2000-05-15 00:00:00
61      2000-04-13 00:00:00 17913960         2000-05-15 00:00:00    2000-06-15 00:00:00
63      2000-04-17 00:00:00 3824185          2000-05-15 00:00:00    2000-06-15 00:00:00
...
1578    2006-01-03 00:00:00 4453005          2006-01-16 00:00:00    2006-02-15 00:00:00
1579    2006-01-04 00:00:00 6557373          2006-01-16 00:00:00    2006-02-15 00:00:00
1580    2006-01-05 00:00:00 2837987          2006-01-16 00:00:00    2006-02-15 00:00:00
...
1634    2006-03-14 00:00:00 31819879         2006-04-17 00:00:00    2006-05-15 00:00:00
1635    2006-03-20 00:00:00 1541321          2006-04-17 00:00:00    2006-05-15 00:00:00
1636    2006-03-21 00:00:00 47047150         2006-04-17 00:00:00    2006-05-15 00:00:00
1638    2006-03-22 00:00:00 6111712          2006-04-17 00:00:00    2006-05-15 00:00:00


Sometimes there is no 15th, since I have a range of business days. For such case I'd like to choose the record closes to 15th of a month. So, the expected outcome would be smth like:

3       2000-01-12 00:00:00  992482          2000-01-17 00:00:00    2000-02-15 00:00:00
...
61      2000-04-13 00:00:00 17913960         2000-05-15 00:00:00    2000-06-15 00:00:00
...
1634    2006-03-14 00:00:00 31819879         2006-04-17 00:00:00    2006-05-15 00:00:00

I figured out how to find every 15th of the month, that is not a very difficult:

mid_date_mask = df['date'].map(lambda x: x.day) == 15

df[mid_date_mask]

But I cannot understand how to incorporate a step back when there is no 15th in a month.

I would be very grateful for your help!

You could try if the following works for you:

mid_date_idx = (
    (df["date"].dt.day - 15).abs()
    .groupby([df["date"].dt.year, df["date"].dt.month])
    .idxmin()
    .to_list()
)
mid_date_mask = df.index.isin(mid_date_idx)
  • df is your dataframe
  • First bulid a series from column date with the absolute difference of the day and 15 .
  • Then group the result by year-month pairs, use .idxmin() to get the first index with the minimum per group, and collect them in a list mid_date_idx .
  • Build the mid_date_mask by checking if the index is in the list.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM