
Simple Linear Regression lm function R

I've read some tutorials about the lm() function in R and I am a little confused about how this function deals with continuous or discrete predictors. In https://www.r-bloggers.com/r-tutorial-series-simple-linear-regression/ , for a continuous predictor, the coefficients represent the intercept and the slope of the linear regression.


This is clear, but suppose I now have a categorical gender variable whose values are 0 or 1. How does the lm() function work in that case? Does it apply a logistic regression, or is it still possible to use the function in this way?

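It is unclear from your question exactly what answer you are looking for. Yes, you can use the lm function with a categorical variable. It still fits an ordinary least-squares linear model and does not switch to logistic regression. When the model includes an interaction with the factor, the resulting equation is the sum of two linear fits: a baseline line for the reference level plus an adjustment that applies only to the other level.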
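To make the 0/1 case from the question concrete, here is a minimal sketch. The data, the variable names (`gender`, `height`, `fit`) and the numbers are all made up for illustration and are not from the question. It shows that with a single 0/1 predictor, lm() still does least squares: the intercept is the mean of the 0 group and the slope is the difference between the two group means.

```r
# Hypothetical data: gender coded 0/1 and a continuous response (made-up numbers).
set.seed(1)
gender <- rep(c(0, 1), each = 20)
height <- 165 + 12 * gender + rnorm(40, 0, 5)

# lm() fits ordinary least squares here; it does not switch to logistic regression.
fit <- lm(height ~ gender)
print(coef(fit))

# With one 0/1 predictor, the intercept equals the mean of the gender == 0 group
# and the slope equals the difference between the two group means.
mean0 <- mean(height[gender == 0])
mean1 <- mean(height[gender == 1])
print(all.equal(unname(coef(fit)), c(mean0, mean1 - mean0)))
```

Logistic regression would only come into play if the 0/1 variable were the response rather than a predictor, in which case glm() with family = binomial is the usual tool.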

It is best to illustrate with an example, using made-up data:

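x  <- 1:10                                        # predictor values
y1 <- x + rnorm(10, 0, 0.1)                       # group A: slope about +1, intercept about 0
y2 <- 14 - x + rnorm(10, 0, 0.1)                  # group B: slope about -1, intercept about 14
f  <- rep(c("A", "B"), each = 10)                 # group label for each row
df <- data.frame(x = c(x, x), y = c(y1, y2), f)   # both groups stacked in one data frame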

#Model 1
print(lm(y1~x))

#   lm(formula = y1 ~ x)
# 
# Coefficients:
# (Intercept)            x  
#      0.1703       0.9754 


#Model 2
model<-lm(y~x*f, data=df)
print(model)

#   lm(formula = y ~ x * f, data = df)
# 
# Coefficients:
# (Intercept)            x           fB         x:fB
#      0.1703       0.9754      13.7622      -1.9709


#Model 3
print(lm(y2~x))

#   lm(formula = y2 ~ x)
# 
# Coefficients:
# (Intercept)            x  
#     13.9325      -0.9955 

After running the code above and comparing Models 1 and 2, you can see that the intercept and the x slope are the same. This is because when the factor is A (the reference level, i.e. 0 or absence), the fB and x:fB terms are 0 and drop out. When the factor is B, fB and x:fB take their fitted values and are added to the model.
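One way to see why those terms drop out for level A is to look at the design matrix that R builds from the formula. The sketch below is an illustration only; the names `d` and `X` are my own, and it rebuilds the predictor columns rather than reusing the objects above.

```r
# Rebuild the predictors used in the example (no response is needed here).
x <- 1:10
d <- data.frame(x = c(x, x), f = factor(rep(c("A", "B"), each = 10)))

# model.matrix() shows the columns that lm() would actually regress on.
X <- model.matrix(~ x * f, data = d)
print(colnames(X))    # "(Intercept)" "x" "fB" "x:fB"

# Row 1 is level A: fB and x:fB are both 0, so only the intercept and x remain.
# Row 11 is level B: fB is 1 and x:fB equals x, so both extra terms contribute.
print(X[c(1, 11), ])
```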

If you add the intercept and fB together, and add the x slope to x:fB, the results are the intercept and slope of Model 3: 0.1703 + 13.7622 = 13.9325 and 0.9754 + (-1.9709) = -0.9955.
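The same check can be done programmatically with coef(). This is a sketch under my own assumptions: the seed and the names `d`, `b`, `b3` and `recovered` are mine, so the fitted numbers will differ from those printed above, but the relationship should hold for any draw.

```r
# Regenerate made-up data in the same way as the example (seed chosen arbitrarily).
set.seed(42)
x  <- 1:10
y1 <- x + rnorm(10, 0, 0.1)
y2 <- 14 - x + rnorm(10, 0, 0.1)
d  <- data.frame(x = c(x, x), y = c(y1, y2), f = rep(c("A", "B"), each = 10))

b  <- coef(lm(y ~ x * f, data = d))   # interaction model (Model 2)
b3 <- coef(lm(y2 ~ x))                # separate fit to group B only (Model 3)

# Intercept + fB gives group B's intercept; x + x:fB gives group B's slope.
recovered <- c(b["(Intercept)"] + b["fB"], b["x"] + b["x:fB"])
print(all.equal(unname(recovered), unname(b3)))
```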

I hope this helps and does not cloud your understanding.
