I'm doing a linear regression using categorical predictors and a numerical outcome between 0 and 1. On this page I saw it suggested to square a numerical predictor when it appears alongside a nominal one (see the third section, Linear Regression with Categorical Predictor). The example they give (for Matlab, but this generalizes to R as well) is the following formula, where weight is continuous and year is nominal:
mdl = fitlm(tbl,'MPG ~ Year + Weight^2')
Is this a universal rule? When I do it, I do get much stronger coefficients, but I want to make sure I'm not inflating them without warrant. Could someone explain the logic of using ^ for numerical predictors alongside categorical ones?
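No, it is not a universal rule. The squared term is there to model curvature in weight, not because year is categorical. If you graph mpg vs. weight for each year separately and see curvature, then a polynomial in weight might help correct for the non-linearity.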
library(lattice)
u <- "https://raw.githubusercontent.com/shifteight/R/master/ISLR/Auto.csv"
Cars <- read.csv(u)
o <- with(Cars, order(year, weight))
xyplot(mpg ~ weight | year, Cars[o, ], type = c("p", "smooth"))
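If the panels do show curvature, one way to check whether the squared term is warranted is to fit the model with and without it and compare the two. The following is only a sketch using the same data; the names fit1 and fit2 are my own, and I am treating year as a factor to match the nominal predictor in the question. Note that in an R formula weight^2 does not square the variable, so the squared term has to be written as I(weight^2).

```r
# Same data as used for the plot above
u <- "https://raw.githubusercontent.com/shifteight/R/master/ISLR/Auto.csv"
Cars <- read.csv(u)

# year as a nominal predictor, weight entering linearly only
fit1 <- lm(mpg ~ factor(year) + weight, data = Cars)

# same model plus a squared weight term; I() is needed because
# ^ has a special meaning inside an R formula
fit2 <- lm(mpg ~ factor(year) + weight + I(weight^2), data = Cars)

# F test comparing the nested models: does the squared term
# improve the fit beyond what chance would explain?
anova(fit1, fit2)
```

The coefficient sizes in the two fits are not directly comparable, because adding the squared term changes what the weight coefficient means, so the model comparison is a better guide than how large the coefficients look.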