How to mix daily and quarterly data in R?
I have quarterly fundamental data from Compustat that looks like this:
fund<-data.frame(quarterlydate=as.Date(c("03/31/1966","06/30/1966"), '%m/%d/%Y'),
gvkey=c(1000,1000,1001,1001), tic=c("XTL", "XTL", "ABL","ABL"),
sales=c(70,75,20,22))
> fund
quarterlydate gvkey tic sales
1 1966-03-31 1000 XTL 70
2 1966-06-30 1000 XTL 75
3 1966-03-31 1001 ABL 20
4 1966-06-30 1001 ABL 22
I also have daily price data from CRSP that looks like this:
days <- seq(as.Date("1966/01/01"), as.Date("1966/06/30"), "days")  # 181 days
prices <- data.frame(dailydate = rep(days, 2),  # one full date range per firm
                     gvkey = c(rep(1000, 181), rep(1001, 181)),
                     tic = c(rep("XTL", 181), rep("ABL", 181)),
                     price = floor(runif(2 * length(days), min = 0, max = 50)))
> head(prices)
dailydate gvkey tic price
1 1966-01-01 1000 XTL 44
2 1966-01-02 1000 XTL 42
3 1966-01-03 1000 XTL 42
4 1966-01-04 1000 XTL 16
5 1966-01-05 1000 XTL 27
6 1966-01-06 1000 XTL 36
> tail(prices)
dailydate gvkey tic price
357 1966-06-25 1001 ABL 0
358 1966-06-26 1001 ABL 28
359 1966-06-27 1001 ABL 4
360 1966-06-28 1001 ABL 18
361 1966-06-29 1001 ABL 49
362 1966-06-30 1001 ABL 4
Question:
1) How can I merge such quarterly and daily datasets into a data frame like the one below?
2) How can I calculate the average quarterly price and assign the values to the quarters (the "average_quarterly_price" variable in the table below)?
I want a merged data frame like this:
dailydate quarterlydates gvkey tic price sales average_quarterly_price
1 1966-01-01 1966-03-31 1000 XTL 1 70 32
2 1966-01-02 1966-03-31 1000 XTL 10 70 32
3 1966-01-03 1966-03-31 1000 XTL 14 70 32
4 1966-01-04 1966-03-31 1000 XTL 29 70 32
5 1966-01-05 1966-03-31 1000 XTL 1 70 32
6 1966-01-06 1966-03-31 1000 XTL 43 70 32
.
.
.
182 1966-04-01 1966-06-30 1000 XTL 11 75 41
183 1966-04-02 1966-06-30 1000 XTL 8 75 41
184 1966-04-03 1966-06-30 1000 XTL 16 75 41
185 1966-04-04 1966-06-30 1000 XTL 14 75 41
186 1966-04-05 1966-06-30 1000 XTL 14 75 41
187 1966-04-06 1966-06-30 1000 XTL 20 75 41
.
.
.
364 1966-01-01 1966-03-31 1001 ABL 18 20 15
365 1966-01-02 1966-03-31 1001 ABL 10 20 15
366 1966-01-03 1966-03-31 1001 ABL 13 20 15
367 1966-01-04 1966-03-31 1001 ABL 13 20 15
368 1966-01-05 1966-03-31 1001 ABL 11 20 15
369 1966-01-06 1966-03-31 1001 ABL 13 20 15
.
.
.
545 1966-04-01 1966-06-30 1001 ABL 14 22 16
555 1966-04-02 1966-06-30 1001 ABL 21 22 16
556 1966-04-03 1966-06-30 1001 ABL 18 22 16
557 1966-04-04 1966-06-30 1001 ABL 18 22 16
558 1966-04-05 1966-06-30 1001 ABL 17 22 16
559 1966-04-06 1966-06-30 1001 ABL 18 22 16
.
.
.
724 1966-06-30 1966-06-30 1001 ABL 22 22 16
Of course, I am not sure if this is the best dataset format, and would appreciate suggestions. My ultimate purpose is to be able to use both daily and quarterly data in a single analysis. For example, I want to be able to find stocks whose quarterly Return on Assets is in the top 20% AND whose daily prices have been lowest in the last 10 days.
Create a "yearqtr" class column in each data frame and then perform a left join of the two data frames using the common column names (here gvkey, tic and the new yearqtr column). Finally use ave to calculate the mean price within each tic and quarter.
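library(zoo)  # provides the yearqtr class
# Tag every row with its calendar quarter
fundq <- transform(fund, yearqtr = as.yearqtr(quarterlydate))
pricesq <- transform(prices, yearqtr = as.yearqtr(dailydate))
# Left join on the shared columns, keeping every daily row
m <- merge(pricesq, fundq, all.x = TRUE)
# Mean price per ticker and quarter, repeated on each row of that group
transform(m, avg_price = ave(price, tic, yearqtr))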
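As a rough sketch of the screening mentioned in the question (top 20% on a quarterly measure, combined with a price at its 10-day low), the same yearqtr join can be followed by two flags. Everything specific here is an assumption made for illustration: the five firms, their sales figures (standing in for Return on Assets, which the sample data does not contain), the straight-line price paths, the column names top20 and at_10day_low, and the reading of "lowest in the last 10 days" as "the latest price equals the minimum of the last 10 observations".

```r
library(zoo)  # as.yearqtr

# Hypothetical quarterly data: five firms, one quarter; sales stands in for ROA
fund <- data.frame(quarterlydate = as.Date("1966-03-31"),
                   gvkey = 1000:1004,
                   sales = c(70, 20, 55, 90, 10))

# Hypothetical daily prices: each firm follows a straight line,
# rising for 1000, 1002, 1004 and falling for 1001, 1003
days <- seq(as.Date("1966-01-01"), as.Date("1966-03-31"), by = "day")
slope <- c(0.5, -0.5, 0.5, -0.5, 0.5)
prices <- data.frame(dailydate = rep(days, times = 5),
                     gvkey = rep(fund$gvkey, each = length(days)),
                     price = 100 + rep(slope, each = length(days)) *
                       rep(seq_along(days), times = 5))

# Quarter key on both sides; flag firms at or above the
# 80th percentile of sales within each quarter
fundq <- transform(fund, yearqtr = as.yearqtr(quarterlydate))
fundq$top20 <- fundq$sales >= ave(fundq$sales, fundq$yearqtr,
                                  FUN = function(x) quantile(x, 0.8))
pricesq <- transform(prices, yearqtr = as.yearqtr(dailydate))

# Left join keeps every daily row and attaches the quarterly columns
m <- merge(pricesq, fundq, by = c("gvkey", "yearqtr"), all.x = TRUE)
m <- m[order(m$gvkey, m$dailydate), ]

# Per firm: is the most recent price the minimum of its last 10 observations?
latest <- do.call(rbind, lapply(split(m, m$gvkey), function(d) {
  last10 <- tail(d$price, 10)
  data.frame(gvkey = d$gvkey[1],
             top20 = d$top20[nrow(d)],
             at_10day_low = last10[length(last10)] == min(last10))
}))

# Firms passing both conditions
screen <- subset(latest, top20 & at_10day_low)
print(screen)
```

With these toy inputs only gvkey 1003 passes both conditions: it has the highest sales and a falling price. Computing the percentile on the quarterly table before the join avoids weighting firms by how many daily rows they happen to have.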