
Multiple Beta Calculation using Linear Regression in R

I have data in an Excel file in the following format:

Y       X1      X2      X3      X4      X5      X6
0.3236% 0.2561% 0.3302% 0.2800% 0.2886% 0.2363% 0.2755%
0.4547% 0.3860% 0.4673% 0.4626% 0.4407% 0.3966% 0.4460%
0.3820% 0.3193% 0.3882% 0.3910% 0.3333% 0.3307% 0.3485%
0.3951% 0.3190% 0.3991% 0.3506% 0.3594% 0.3230% 0.3692%
0.4460% 0.4047% 0.4566% 0.3841% 0.4125% 0.3561% 0.4319%
0.4525% 0.4163% 0.4629% 0.4142% 0.4000% 0.3871% 0.4357%
0.3680% 0.4011% 0.3759% 0.3890% 0.4193% 0.4802% 0.3490%
0.4304% 0.2657% 0.4224% 0.4619% 0.4936% 0.3776% 0.2827%
0.1360% 0.1866% 0.1351% 0.1694% 0.1853% 0.1316% 0.1649%
0.1317% 0.1335% 0.1276% 0.1682% 0.1960% 0.1318% 0.1356%
0.2713% 0.4491% 0.2891% 0.1901% 0.3513% 0.1816% 0.3869%
0.2404% 0.2389% 0.2371% 0.2217% 0.2162% 0.1827% 0.2571%
0.4934% 0.4529% 0.5047% 0.4766% 0.3890% 0.4124% 0.4610%
0.4083% 0.4513% 0.4128% 0.3612% 0.3974% 0.3759% 0.4667%
0.3033% 0.3063% 0.3058% 0.3342% 0.2688% 0.3286% 0.3019%
0.2976% 0.3226% 0.2967% 0.2697% 0.2626% 0.2860% 0.3172%
0.2505% 0.3238% 0.2554% 0.2682% 0.2495% 0.3014% 0.2931%
0.2077% 0.2491% 0.2019% 0.1866% 0.2063% 0.2065% 0.1928%
0.3669% 0.3316% 0.3703% 0.3034% 0.2806% 0.3556% 0.3310%

I need to run a first linear regression,

Y = M1*X1 + C

and store the estimated value of M1 in a table, say T.

Next, I need to run a second linear regression,

Y = M2*X2 + C

and store the estimated value of M2 in the same table T.

The process repeats through M6, so that T ends up holding all six values.

This is a financial beta calculation that I need for research.

The data file is available at the link below for reference.

https://drive.google.com/file/d/1lam3tA9iPXJvqtEO-89vmbVIkxYf3sG3/view?usp=sharing

You're probably looking for this. I've simulated some data since you've provided no M* columns.

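# Strip the "%" signs and convert every column to numeric (d is defined under "Data" below)
d[] <- lapply(d, function(x) as.numeric(gsub("%", "", x)))

# For i = 1..6, fit y ~ Mi*Xi and keep the second coefficient (the one on Mi)
T <- sapply(1:6, function(i) {
  FO <- reformulate(paste0("M", i, "*", "X", i), response="y")
  coef(lm(FO, d))[2]
})
T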
#         M1         M2         M3         M4         M5         M6 
# -0.6628572 -0.8423720  0.4183318  0.5154114  2.5278650  1.1081990 

reformulate builds the formula FO from a string; sapply feeds it the values 1 to 6, so the same digit is always appended to both M and X. lm then fits the model, and coef(...)[2] extracts the desired coefficient. In an R formula, M1*X1 expands to M1 + X1 + M1:X1, so the second coefficient is the one on M1, the first term after the intercept. Overall, sapply returns the desired vector T.
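If the M values in the question are meant to be the slopes themselves rather than columns in the data, the same sapply pattern applies with a one-term formula. Below is a minimal sketch under that assumption, using the first eight rows of the question's table. The names dat, xs, betas, betas_cov and beta_table are made up for illustration (T is avoided as a name because R also uses it as shorthand for TRUE). Each slope is cross-checked against cov(X, Y) / var(X), which is what the slope of a one-predictor least-squares fit reduces to.

```r
# First eight rows of the question's data, kept as percent strings
txt <- "
Y X1 X2 X3 X4 X5 X6
0.3236% 0.2561% 0.3302% 0.2800% 0.2886% 0.2363% 0.2755%
0.4547% 0.3860% 0.4673% 0.4626% 0.4407% 0.3966% 0.4460%
0.3820% 0.3193% 0.3882% 0.3910% 0.3333% 0.3307% 0.3485%
0.3951% 0.3190% 0.3991% 0.3506% 0.3594% 0.3230% 0.3692%
0.4460% 0.4047% 0.4566% 0.3841% 0.4125% 0.3561% 0.4319%
0.4525% 0.4163% 0.4629% 0.4142% 0.4000% 0.3871% 0.4357%
0.3680% 0.4011% 0.3759% 0.3890% 0.4193% 0.4802% 0.3490%
0.4304% 0.2657% 0.4224% 0.4619% 0.4936% 0.3776% 0.2827%
"
dat <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Strip the "%" signs and convert every column to numeric.
# Y and X share the same unit, so the slope is unaffected by leaving values in percent.
dat[] <- lapply(dat, function(x) as.numeric(sub("%", "", x, fixed = TRUE)))

xs <- paste0("X", 1:6)

# One simple regression Y ~ Xi per predictor; keep only the slope
betas <- sapply(xs, function(x) {
  fit <- lm(reformulate(x, response = "Y"), data = dat)
  unname(coef(fit)[2])
})

# Collect the six slopes in a table
beta_table <- data.frame(predictor = xs, beta = unname(betas))

# Cross-check: the slope of a simple regression equals cov(X, Y) / var(X)
betas_cov <- sapply(xs, function(x) cov(dat[[x]], dat$Y) / var(dat[[x]]))
stopifnot(isTRUE(all.equal(unname(betas), unname(betas_cov))))

print(beta_table)
```

To run this on the full data set, dat would be replaced by the data frame read from the Excel file; the rest of the sketch does not depend on the number of rows.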

Data

d <- structure(list(y = c("0.9148%", "0.9371%", "0.2861%", "0.8304%", 
"0.6417%", "0.5191%", "0.7366%", "0.1347%", "0.657%", "0.7051%"
), X1 = c("0.4577%", "0.7191%", "0.9347%", "0.2554%", "0.4623%", 
"0.94%", "0.9782%", "0.1175%", "0.475%", "0.5603%"), X2 = c("0.904%", 
"0.1387%", "0.9889%", "0.9467%", "0.0824%", "0.5142%", "0.3902%", 
"0.9057%", "0.447%", "0.836%"), X3 = c("0.7376%", "0.8111%", 
"0.3881%", "0.6852%", "0.0039%", "0.8329%", "0.0073%", "0.2077%", 
"0.9066%", "0.6118%"), X4 = c("0.3796%", "0.4358%", "0.0374%", 
"0.9735%", "0.4318%", "0.9576%", "0.8878%", "0.64%", "0.971%", 
"0.6188%"), X5 = c("0.3334%", "0.3467%", "0.3985%", "0.7847%", 
"0.0389%", "0.7488%", "0.6773%", "0.1713%", "0.2611%", "0.5144%"
), X6 = c("0.6756%", "0.9828%", "0.7595%", "0.5665%", "0.8497%", 
"0.1895%", "0.2713%", "0.8282%", "0.6932%", "0.2405%"), M1 = c("0.043%", 
"0.1405%", "0.2164%", "0.4794%", "0.1974%", "0.7194%", "0.0079%", 
"0.3755%", "0.5144%", "0.0016%"), M2 = c("0.5816%", "0.1579%", 
"0.359%", "0.6456%", "0.7758%", "0.5636%", "0.2337%", "0.09%", 
"0.0856%", "0.3052%"), M3 = c("0.6674%", "2e-04%", "0.2086%", 
"0.933%", "0.9256%", "0.7341%", "0.3331%", "0.5151%", "0.744%", 
"0.6192%"), M4 = c("0.6262%", "0.2172%", "0.2166%", "0.3889%", 
"0.9425%", "0.9626%", "0.7399%", "0.7332%", "0.5358%", "0.0023%"
), M5 = c("0.6089%", "0.8368%", "0.7515%", "0.4527%", "0.5358%", 
"0.5374%", "0.0014%", "0.3557%", "0.6121%", "0.8289%"), M6 = c("0.3567%", 
"0.4106%", "0.5735%", "0.5897%", "0.7197%", "0.395%", "0.9192%", 
"0.9626%", "0.2335%", "0.7245%")), row.names = c(NA, -10L), class = "data.frame")
