简体   繁体   English

如何在R中混合每日和季度数据?

[英]How to mix daily and quarterly data in R?

I have quarterly fundamental data from Compustat that looks like this: 我有Compustat的季度基本数据,如下所示:

fund<-data.frame(quarterlydate=as.Date(c("03/31/1966","06/30/1966"), '%m/%d/%Y'), 
               gvkey=c(1000,1000,1001,1001), tic=c("XTL", "XTL", "ABL","ABL"), 
               sales=c(70,75,20,22))

> fund
  quarterlydate gvkey tic sales
1    1966-03-31  1000 XTL    70
2    1966-06-30  1000 XTL    75
3    1966-03-31  1001 ABL    20
4    1966-06-30  1001 ABL    22

I also have daily price data from CRSP that looks like this: 我也有来自CRSP的每日价格数据,如下所示:

prices<-data.frame(dailydate=seq(as.Date("1966/01/01"), as.Date("1966/06/30"), "days"), gvkey=c(rep(1000, 181),rep(1001, 181)),
                   tic=c(rep("XTL",181), rep("ABL",181)),
                         price=floor(runif(length(seq(as.Date("1966/01/01"), as.Date("1966/06/30"), "days")), min=0, max=50)))

> head(prices)
   dailydate gvkey tic price
1 1966-01-01  1000 XTL    44
2 1966-01-02  1000 XTL    42
3 1966-01-03  1000 XTL    42
4 1966-01-04  1000 XTL    16
5 1966-01-05  1000 XTL    27
6 1966-01-06  1000 XTL    36

> tail(prices)
     dailydate gvkey tic price
357 1966-06-25  1001 ABL     0
358 1966-06-26  1001 ABL    28
359 1966-06-27  1001 ABL     4
360 1966-06-28  1001 ABL    18
361 1966-06-29  1001 ABL    49
362 1966-06-30  1001 ABL     4

Question: 题:

1) How can I merge such a quarterly and daily datasets to have a dataframe like the one below? 1)如何合并这样的季度和每日数据集,使其具有以下数据框?

2) How can I calculate the average quarterly price and assign the values to the quarters? 2)如何计算平均季度价格并将值分配给季度? (the "average_quarterly_price" variable in the table below) (下表中的“ average_quarterly_price”变量)

I want a merged data frame like this: 我想要这样的合并数据框:

 dailydate   quarterlydates gvkey tic  price  sales average_quarterly_price
1 1966-01-01   1966-03-31    1000  XTL     1    70   32
2 1966-01-02   1966-03-31    1000  XTL    10    70   32
3 1966-01-03   1966-03-31    1000  XTL    14    70   32
4 1966-01-04   1966-03-31    1000  XTL    29    70   32
5 1966-01-05   1966-03-31    1000  XTL     1    70   32
6 1966-01-06   1966-03-31    1000  XTL    43    70   32
.
.
.
182 1966-04-01  1966-06-31   1000  XTL    11    75   41
183 1966-04-02  1966-06-31   1000  XTL    8     75   41 
184 1966-04-03  1966-06-31   1000  XTL    16    75   41
185 1966-04-04  1966-06-31   1000  XTL    14    75   41
186 1966-04-05  1966-06-31   1000  XTL    14    75   41
187 1966-04-06  1966-06-31   1000  XTL    20    75   41
.
.
.
364 1966-01-01  1966-03-31   1001  ABL    18    20   15
365 1966-01-02  1966-03-31   1001  ABL    10    20   15
366 1966-01-03  1966-03-31   1001  ABL    13    20   15
367 1966-01-04  1966-03-31   1001  ABL    13    20   15
368 1966-01-05  1966-03-31   1001  ABL    11    20   15
369 1966-01-06  1966-03-31   1001  ABL    13    20   15
.
.
.
545 1966-04-01  1966-06-31   1001  ABL    14    22   16
555 1966-04-02  1966-06-31   1001  ABL    21    22   16 
556 1966-04-03  1966-06-31   1001  ABL    18    22   16
557 1966-04-04  1966-06-31   1001  ABL    18    22   16
558 1966-04-05  1966-06-31   1001  ABL    17    22   16
559 1966-04-06  1966-06-31   1001  ABL    18    22   16
.
.
.
724 1966-06-31  1966-06-31  1001  ABL    22    22   16

Of course I am not sure if this is the best dataset format, and would appreciate suggestions. 当然,我不确定这是否是最好的数据集格式,并希望提出建议。 My ultimate purpose is to be able to use both daily and quarterly data in a single analysis. 我的最终目的是能够在单个分析中同时使用每日和季度数据。 For the sake of example, I want to be able to to find stocks with quarterly Return on Assets in the top 20% percentile AND whose daily prices have been lowest in the last 10 days. 举例来说,我希望能够找到季度资产收益率最高的20%百分数,并且其每日价格在过去10天中最低的股票。

Create a "yearqtr" class column in each data frame and then perform a left join of the two data frames using common column names. 在每个数据框中创建一个"yearqtr"类列,然后使用公共列名对两个数据框执行左连接。 Finally use ave to calculate the mean. 最后使用ave计算平均值。

library(zoo) # yearqtr class

fundq <- transform(fund, yearqtr = as.yearqtr(quarterlydate))
pricesq <- transform(prices, yearqtr = as.yearqtr(dailydate))
m <- merge(pricesq, fundq, all.x = TRUE)
transform(m, avg_price = ave(price, tic, yearqtr))

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM