
Apply regression coefficients that have one answer per factor to many entries per factor in a dataframe in R

I have a dataframe with columns for time, symbol, price, and volatility. I use this dataframe to run a first-pass OLS regression with dummy variables for the symbol:

fit <- lm(volatility~factor(symbol) + 0)

Then I want to use the coefficients from that regression in a second-pass regression, so I save them for reuse and use them to scale volatility:

scale <- summary(fit)$coefficients[,1]
yscale <- volatility/scale
fit2 <- lm(yscale~factor(time) + factor(symbol)*factor(time) + 0)

The problem is that I want to use the factor coefficient that applies to each symbol. In the original dataframe, I want to divide each volatility by the coefficient that matches its symbol. So, if I have the symbols DDX, CTY, and LOL, I want to divide DDX's volatility by the DDX coefficient from the regression, then do the same for CTY and LOL. I also need to figure out how to do the product term in the second regression, fit2.

You should provide a reproducible example to get an exact answer. Here is some sample data:

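dat <- data.frame(volatility = rnorm(30),
                  symbol = sample(c('DDX', 'CTY', 'LOL'), 30, replace = TRUE))
# First-pass regression: one dummy coefficient per symbol, no intercept
fit <- lm(volatility ~ factor(symbol) + 0, data = dat)
mm <- coef(fit)
# Strip the "factor(symbol)" prefix so the names are the bare symbols
names(mm) <- gsub('factor\\(symbol\\)', '', names(mm))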

I strip the prefix from the coefficient names so that they match the symbols and can be used for lookup later:

   CTY        DDX        LOL 
 -0.1991273  0.1331980 -0.1567511 
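
Because the model has no intercept and only one dummy-coded factor, each coefficient is simply the mean volatility of its symbol. A minimal sketch to check this on simulated data follows. The seed and the name `grp_means` are illustrative assumptions, not part of the original post.

```r
set.seed(1)  # arbitrary seed, only so this sketch is repeatable
dat <- data.frame(volatility = rnorm(30),
                  symbol = sample(c("DDX", "CTY", "LOL"), 30, replace = TRUE))

# Same first-pass regression and name clean-up as above
fit <- lm(volatility ~ factor(symbol) + 0, data = dat)
mm <- coef(fit)
names(mm) <- gsub("factor\\(symbol\\)", "", names(mm))

# Per-symbol means computed directly, without a regression
grp_means <- tapply(dat$volatility, dat$symbol, mean)

# Compare the two, matching entries by symbol name
all.equal(unname(mm), as.vector(grp_means[names(mm)]))
```

If the two agree, scaling by the coefficients is the same as scaling each row by its symbol's average volatility.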

Then, using transform, I divide each volatility by the coefficient of its symbol:

transform(dat, vol.scale = volatility/mm[as.character(symbol)], coef = mm[as.character(symbol)])
      volatility symbol    vol.scale       coef
1  -0.592306253    DDX  -4.44680974  0.1331980
2   1.143486046    DDX   8.58485769  0.1331980
3  -0.693694139    LOL   4.42544868 -0.1567511
4  -0.166050131    LOL   1.05932325 -0.1567511
5   1.381900588    CTY  -6.93978353 -0.1991273
..............................
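
For the second-pass regression asked about in the question, one possible sketch is shown below. The original post does not show its data, so the `time` column, the `obs` replicate index, the seed, and the simulated values are all assumptions made for illustration.

```r
set.seed(42)  # arbitrary seed for this sketch
# Hypothetical panel: 3 symbols x 4 time points, 5 observations per cell
dat <- expand.grid(time = 1:4, symbol = c("DDX", "CTY", "LOL"), obs = 1:5,
                   stringsAsFactors = FALSE)
dat$volatility <- abs(rnorm(nrow(dat), mean = 1))

# First pass: one coefficient per symbol, no intercept
fit <- lm(volatility ~ factor(symbol) + 0, data = dat)
mm <- coef(fit)
names(mm) <- gsub("factor\\(symbol\\)", "", names(mm))

# Scale each row by the coefficient of its own symbol (lookup by name)
dat$yscale <- dat$volatility / unname(mm[as.character(dat$symbol)])

# Second pass: time effects plus symbol-by-time product (interaction) terms
fit2 <- lm(yscale ~ factor(time) + factor(symbol):factor(time) + 0, data = dat)
coef(fit2)
```

In an R formula, `a*b` is shorthand for `a + b + a:b`, so `factor(symbol)*factor(time)` already includes both main effects, while `:` on its own adds only the product terms. Which of the two is appropriate depends on the model you actually want to fit.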
