My dataset is quite big, so I'm using 10 rows of data as an example. I've worked out the answer in Excel but can't replicate it in R, so I need help with the code:
constant<-c(6.10,5.12,5.04,4.97,4.89,4.89,4.87,4.87,4.88,4.99)
years.star<-c(219.87,153.69,146.19,139.35,127.27,127.27,121.91,121.91,112.28,99.98)
years.sq.star<-c(7915.41,4610.71,4239.78,3901.93,3309.27,3309.27,3047.95,3047.95,2582.58,1999.62)
ln.salary<-c(28.43,23.12,21.59,21.44,22.71,23.33,20.29,21.76,21.48,22.92)
try<-data.frame(constant,years.star,years.sq.star,ln.salary)
ln.salary is the dependent variable. The answer you should get is:
intercept- 6.474922
beta1- -0.15026
beta2- 0.002769
My problem is that in R, the lm function does not know that my intercept column has the values above; it uses 1,1,1,1,1,1,1,1,1,1 instead of 6.10, 5.12, etc.
So test<-lm(ln.salary~years.star+years.sq.star,data=try,weights=constant)
does not work because it will just generate this answer:
intercept- 207.1706
beta1- -3.13214
beta2- 0.064416
In essence, I've taken data and tried to adjust for heteroscedasticity. In the final step, I have my constant star and my transformed x variables. The last step is to regress ln.salary on the constant and x variables to give me the answer you should get as per above.
I can do it in Excel but not in R, and I know I'm not getting the code right. I know the problem is that lm generates its own intercept column (1,1,1,...). Could you please help?
Kind regards D
If you want to "fix" an intercept at a particular constant, you should subtract the value of that constant from the response, and then fit a no-intercept model. For example
test <- lm( ln.salary - 6.474922 ~ years.star + years.sq.star + 0,
data=try, weights=constant)
Here we subtract off the intercept term and add +0 to the formula to indicate that no intercept term should be fitted. With that model I get
Call:
lm(formula = ln.salary - 6.474922 ~ years.star + years.sq.star +
0, data = try, weights = constant)
Coefficients:
years.star years.sq.star
0.197384 -0.002842
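As a small sketch of the fixed-intercept fit above, the following rebuilds the question's data and checks that no intercept is estimated. The names dat, fit_fixed and fitted_original are chosen here for illustration (dat is used instead of try only to avoid reusing the name of a base R function). Because the constant was subtracted from the response, it has to be added back to get fitted values on the original scale.

```r
# Data copied from the question; `dat` is an illustrative name
constant      <- c(6.10, 5.12, 5.04, 4.97, 4.89, 4.89, 4.87, 4.87, 4.88, 4.99)
years.star    <- c(219.87, 153.69, 146.19, 139.35, 127.27, 127.27, 121.91, 121.91, 112.28, 99.98)
years.sq.star <- c(7915.41, 4610.71, 4239.78, 3901.93, 3309.27, 3309.27, 3047.95, 3047.95, 2582.58, 1999.62)
ln.salary     <- c(28.43, 23.12, 21.59, 21.44, 22.71, 23.33, 20.29, 21.76, 21.48, 22.92)
dat <- data.frame(constant, years.star, years.sq.star, ln.salary)

# Same model as in the answer: intercept fixed at 6.474922, slopes estimated
fit_fixed <- lm(ln.salary - 6.474922 ~ years.star + years.sq.star + 0,
                data = dat, weights = constant)

# Only the two slope terms appear; there is no "(Intercept)" entry
names(coef(fit_fixed))

# fitted() is on the shifted scale, so add the fixed intercept back
fitted_original <- fitted(fit_fixed) + 6.474922
```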
If you want varying "intercepts" for each row, then you need to use an 'offset' rather than a 'weight':
test<-lm(ln.salary~years.star+years.sq.star+0,data=try,offset=constant)
Call:
lm(formula = ln.salary ~ years.star + years.sq.star + 0, data = try,
offset = constant)
Coefficients:
years.star years.sq.star
0.236355 -0.003881
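One way to see what the offset does: it enters the linear predictor with its coefficient fixed at 1, so the fit should match a no-intercept regression of ln.salary minus constant on the same predictors. The sketch below checks this on the question's data; dat, fit_offset and fit_shift are names introduced here for illustration.

```r
# Data copied from the question; `dat` is an illustrative name
constant      <- c(6.10, 5.12, 5.04, 4.97, 4.89, 4.89, 4.87, 4.87, 4.88, 4.99)
years.star    <- c(219.87, 153.69, 146.19, 139.35, 127.27, 127.27, 121.91, 121.91, 112.28, 99.98)
years.sq.star <- c(7915.41, 4610.71, 4239.78, 3901.93, 3309.27, 3309.27, 3047.95, 3047.95, 2582.58, 1999.62)
ln.salary     <- c(28.43, 23.12, 21.59, 21.44, 22.71, 23.33, 20.29, 21.76, 21.48, 22.92)
dat <- data.frame(constant, years.star, years.sq.star, ln.salary)

# Offset version, as in the answer
fit_offset <- lm(ln.salary ~ years.star + years.sq.star + 0,
                 data = dat, offset = constant)

# Equivalent version: subtract the per-row constant from the response by hand
fit_shift <- lm(I(ln.salary - constant) ~ years.star + years.sq.star + 0,
                data = dat)

# The two sets of slope estimates agree up to numerical tolerance
same_coefs <- all.equal(unname(coef(fit_offset)), unname(coef(fit_shift)))
same_coefs
```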
I'm not so impressed with the fact that this doesn't agree with Excel. Excel's linear regression routine is known to be rather flaky. If, on the other hand, you are sure you need to use weights, then you should clarify which of the three possible interpretations of the term is being used (replication, sampling, or inverse variance). The lm interpretation of a "weight" is the inverse-variance version (its help page describes the weights as "inversely proportional to the variance"), so if those "constant" terms are variances, then perhaps you want:
> (test<-lm(ln.salary~years.star+years.sq.star+0, data=try, weights=1/constant) )
Call:
lm(formula = ln.salary ~ years.star + years.sq.star + 0, data = try,
weights = 1/constant)
Coefficients:
years.star years.sq.star
0.309391 -0.005189
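A further possibility, if the columns in the question are already the transformed variables (as its last paragraph describes): the constant column then plays the role of the intercept, so it can be entered as an ordinary regressor in a no-intercept model with no weights at all. The sketch below shows that call, and then uses made-up raw data to check the general identity behind it: a weighted fit with weights w matches an unweighted, no-intercept fit on variables scaled by sqrt(w). The names dat, fit_const, x, y, w and s are illustrative, and the simulated data are not from the question.

```r
# Data copied from the question; `dat` is an illustrative name
constant      <- c(6.10, 5.12, 5.04, 4.97, 4.89, 4.89, 4.87, 4.87, 4.88, 4.99)
years.star    <- c(219.87, 153.69, 146.19, 139.35, 127.27, 127.27, 121.91, 121.91, 112.28, 99.98)
years.sq.star <- c(7915.41, 4610.71, 4239.78, 3901.93, 3309.27, 3309.27, 3047.95, 3047.95, 2582.58, 1999.62)
ln.salary     <- c(28.43, 23.12, 21.59, 21.44, 22.71, 23.33, 20.29, 21.76, 21.48, 22.92)
dat <- data.frame(constant, years.star, years.sq.star, ln.salary)

# Treat `constant` as the transformed intercept column: no automatic
# intercept (the leading 0), no weights, three estimated coefficients
fit_const <- lm(ln.salary ~ 0 + constant + years.star + years.sq.star, data = dat)
coef(fit_const)

# General identity, checked on simulated raw data (assumption: variance
# grows with x, so the weights are 1/x)
set.seed(1)
x <- runif(20, 1, 30)
y <- 2 + 0.1 * x + rnorm(20, sd = sqrt(x))
w <- 1 / x
s <- sqrt(w)

fit_w <- lm(y ~ x, weights = w)            # weighted least squares
fit_t <- lm(I(s * y) ~ 0 + s + I(s * x))   # OLS on sqrt(w)-scaled variables

same_fit <- all.equal(unname(coef(fit_w)), unname(coef(fit_t)))
same_fit
```

In the scaled fit, the coefficient on s is the intercept of the weighted model, which is why a transformed "constant" column replaces the usual column of ones.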