Matlab / R - Linear regression with categorical and continuous predictors - why is the continuous predictor squared?

Question

I'm doing a linear regression using categorical predictors and a 0 to 1 numerical outcome. On this page I saw it suggested to square a numerical predictor when it is alongside a nominal one (see the third section, "Linear Regression with Categorical Predictor"). The example they give (for Matlab, but this generalizes to R as well) is the following formula, where Weight is continuous and Year is nominal:

mdl = fitlm(tbl,'MPG ~ Year + Weight^2')

Is this a universal rule? When I do it, I do get much stronger coefficients, but I want to make sure I'm not inflating them without warrant. Could someone explain the logic of using .^ for numerical predictors alongside categorical ones?

Answer 1

If you graph mpg vs. weight for each year separately and see curvature, then a polynomial in weight might help correct for the non-linearity.

library(lattice)

u <- "https://raw.githubusercontent.com/shifteight/R/master/ISLR/Auto.csv"
Cars <- read.csv(u)

o <- with(Cars, order(year, weight))
xyplot(mpg ~ weight | year, Cars[o, ], type = c("p", "smooth"))
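
If the plots do show curvature, one way to check whether a squared term is warranted is to fit the model with and without it and compare the two fits. The following is only a minimal sketch: it uses R's built-in mtcars data as a stand-in (an assumption, so that it runs without a download), with cyl playing the role of the nominal predictor and wt the continuous one. Note that the formula syntax differs between the two languages: in Matlab's Wilkinson notation, Weight^2 expands to Weight plus Weight squared, whereas in an R formula ^ is a crossing operator, so the square has to be wrapped in I().

```r
# Stand-in data: built-in mtcars (an assumption; the thread uses the Auto data).
# cyl plays the role of the nominal predictor, wt the continuous one.
cars <- transform(mtcars, cyl = factor(cyl))

# Model 1: weight enters linearly.
fit_linear <- lm(mpg ~ cyl + wt, data = cars)

# Model 2: weight enters as a quadratic. Inside an R formula, `wt^2` is
# formula algebra (crossing), not arithmetic, so the square needs I().
fit_quad <- lm(mpg ~ cyl + wt + I(wt^2), data = cars)

# Nested-model F test: does the squared term explain additional variance?
cmp <- anova(fit_linear, fit_quad)
print(cmp)

# Adjusted R^2 penalizes the extra parameter, unlike raw R^2.
adj_r2 <- c(linear    = summary(fit_linear)$adj.r.squared,
            quadratic = summary(fit_quad)$adj.r.squared)
print(adj_r2)
```

If the F test and the adjusted R^2 do not favor the quadratic model, the squared term is probably not earning its place. Coefficient sizes alone are not a good guide here, because the coefficient on weight means something different once weight squared is also in the model.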

Answer 1: 4, accepted, 2017-12-10 00:22:32

Matlab / R-具有分类和连续预测变量的线性回归-为什么连续预测变量平方？

问题描述

1 个解决方案

解决方案1 4 已采纳 2017-12-10 00:22:32

解决方案1
4 已采纳 2017-12-10 00:22:32