
Delete first calendar month of each group in data.table

I have daily observations of share returns in one large data.table. I am new to data.table and trying to learn it. I would like to exclude the first calendar month of observations for each share. The first calendar month is not always a full month; sometimes the data begin in the middle of the month. For example, I have the following:

> priceDT
          Date      Return Share
 1: 2014-09-25  0.06535000   ACG
 2: 2014-09-26  0.08786100   ACG
 3: 2014-09-29  0.22501097   ACG
 4: 2014-09-30 -0.03802740   ACG
 5: 2014-10-01 -0.07398872   ACG
 6: 2014-10-02 -0.11121113   ACG
 7: 2016-04-25  0.04110000   FSR
 8: 2016-04-26 -0.06731000   FSR
 9: 2016-04-29  0.01374000   FSR
10: 2016-04-30 -0.02140000   FSR
11: 2016-05-01  0.01188000   FSR
12: 2016-05-02  0.01300000   FSR

I want to remove September and April from the ACG and FSR share returns respectively, which would give the following output:

> outcome
         Date      Return Share
1: 2014-10-01 -0.07398872   ACG
2: 2014-10-02 -0.11121113   ACG
3: 2016-05-01  0.01188000   FSR
4: 2016-05-02  0.01300000   FSR

How would this be done in data.table?

DATA:

library(data.table)

priceDT <- fread("Date, Return, Share
                 2014-09-25,0.06535,ACG
                 2014-09-26,0.087861,ACG
                 2014-09-29,0.22501097,ACG
                 2014-09-30,-0.0380274,ACG
                 2014-10-01,-0.07398872,ACG
                 2014-10-02,-0.11121113,ACG
                 2016-04-25,0.0411,FSR
                 2016-04-26,-0.06731,FSR
                 2016-04-29,0.01374,FSR
                 2016-04-30,-0.0214,FSR
                 2016-05-01,.01188,FSR
                 2016-05-02,0.013,FSR
                 ")

outcome <- fread("Date, Return, Share
                 2014-10-01,-0.07398872,ACG
                 2014-10-02,-0.11121113,ACG
                 2016-05-01,.01188,FSR
                 2016-05-02,0.013,FSR
                 ")

You could exclude the observations that share the month and year of the first Date for each Share:

library(data.table)
priceDT[, .SD[format(Date, '%Y-%m') != format(first(Date), '%Y-%m')], Share]

#   Share       Date      Return
#1:   ACG 2014-10-01 -0.07398872
#2:   ACG 2014-10-02 -0.11121113
#3:   FSR 2016-05-01  0.01188000
#4:   FSR 2016-05-02  0.01300000
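The same filter can also be written without subsetting .SD, which may be faster when there are many groups. The following is a minimal sketch, not the answerer's code: the helper column ym and the shortened toy table are assumptions for illustration, and Date is built explicitly as IDate because older data.table versions read dates in as character, in which case format(Date, '%Y-%m') would not work.

```r
library(data.table)

# Shortened toy data mirroring the question; Date is created as IDate on purpose
priceDT <- data.table(
  Date   = as.IDate(c("2014-09-25", "2014-09-30", "2014-10-01", "2014-10-02",
                      "2016-04-29", "2016-04-30", "2016-05-01", "2016-05-02")),
  Return = c(0.06535, -0.0380274, -0.07398872, -0.11121113,
             0.01374, -0.0214, 0.01188, 0.013),
  Share  = rep(c("ACG", "FSR"), each = 4L)
)

# Hypothetical helper column: year-month label per row, e.g. "2014-09"
priceDT[, ym := format(Date, "%Y-%m")]

# .I holds row numbers of the full table; within each Share keep the rows whose
# label differs from the label of the group's first row
# (assumes rows are sorted by Date within Share)
keep <- priceDT[, .I[ym != ym[1L]], by = Share]$V1

# Subset once by row number and drop the helper column from the result
result <- priceDT[keep, .(Date, Return, Share)]
```

Here result holds the October 2014 rows for ACG and the May 2016 rows for FSR. If the Date column was read as character, converting it first with priceDT[, Date := as.IDate(Date)] should make either version applicable.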

Another option is to drop the first run of identical months within each Share, using rleid:

priceDT[, .SD[rleid(month(Date)) != 1L], by = Share]

If the data is not already sorted, you should add order(Date):

priceDT[order(Date), .SD[rleid(month(Date)) != 1L], by = Share]
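
To see what rleid is doing here, and one case where month() alone may not be enough, consider the sketch below. The table gapDT and the share name XYZ are hypothetical and not part of the question's data; they illustrate a share whose next observation after its first month falls in the same calendar month one year later.

```r
library(data.table)

# rleid() numbers consecutive runs of equal values
months <- c(9L, 9L, 9L, 10L, 10L)
run_id <- rleid(months)   # 1 1 1 2 2

# Hypothetical share: first month is September 2014,
# the next observations are in September 2015
gapDT <- data.table(
  Date  = as.IDate(c("2014-09-29", "2014-09-30", "2015-09-01", "2015-09-02")),
  Share = "XYZ"
)

# month() alone sees a single run of 9s, so every row is dropped
by_month <- gapDT[, .SD[rleid(month(Date)) != 1L], by = Share]

# Passing year() as well separates the two Septembers,
# so only the 2014 rows are dropped
by_year_month <- gapDT[, .SD[rleid(year(Date), month(Date)) != 1L], by = Share]
```

For daily data without year-long gaps the two variants behave the same, so the extra year() argument is only a safeguard.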

An option with dplyr and zoo:

library(dplyr)
library(zoo)
priceDT %>%
  group_by(Share) %>%
  filter(as.yearmon(Date) != as.yearmon(first(Date)))
