Does linear regression work with a categorical independent variable & continuous dependent variable?
I have a dataset where:
X1 - categorical independent variable
X2 - continuous independent variable
y - continuous dependent variable
And I'm looking to use X1 and X2 to predict y. Is linear regression appropriate for this (does it even make sense to regress on a categorical independent variable)? If so, how can I use linear regression when X1 is a categorical independent variable (e.g. eye colour)?
Should I create a separate linear regression model for each of the categories in X1, or try to create a multiple linear regression model?
Looking online, most resources cover either a continuous independent variable with a continuous dependent variable (linear regression) or a continuous independent variable with a categorical dependent variable (logistic regression).
I would appreciate being pointed to any resources/tools that could help me.
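You can use linear regression, but you first need to encode X1 as a series of numeric indicator variables. With an intercept in the model, a categorical variable with k levels needs k - 1 such variables; the level left out (Blue in the table below) serves as the baseline that the other levels are compared against.
Here's a simple example, using the 'dummy coding' method: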
┏━━━━━━━━━━━━┳━━━━━┳━━━━━┓
┃ Eye Colour ┃ x11 ┃ x12 ┃
┣━━━━━━━━━━━━╋━━━━━╋━━━━━┫
┃ Blue ┃ 0 ┃ 0 ┃
┣━━━━━━━━━━━━╋━━━━━╋━━━━━┫
┃ Brown ┃ 1 ┃ 0 ┃
┣━━━━━━━━━━━━╋━━━━━╋━━━━━┫
┃ Green ┃ 0 ┃ 1 ┃
┗━━━━━━━━━━━━┻━━━━━┻━━━━━┛
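To make the coding above concrete, here is a minimal sketch in Python using NumPy. The data are made up purely for illustration, and y is built from known coefficients so the fit can be checked. The variable names (`eye_colour`, `x2`, `offset`) are assumptions and not part of the original question. The model being fitted is y = b0 + b1*x11 + b2*x12 + b3*X2, a single multiple regression, not one model per category.

```python
import numpy as np

# Hypothetical toy data, invented for this illustration only.
eye_colour = ["Blue", "Brown", "Green", "Blue", "Brown", "Green", "Blue", "Brown"]
x2 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

# Build y from known coefficients (no noise) so the result is checkable:
# intercept 10 for the baseline (Blue), Brown +2, Green -3, slope 0.5 on X2.
offset = {"Blue": 0.0, "Brown": 2.0, "Green": -3.0}
y = 10.0 + np.array([offset[c] for c in eye_colour]) + 0.5 * x2

# Dummy coding as in the table: Blue is the reference level and gets no column.
x11 = np.array([1.0 if c == "Brown" else 0.0 for c in eye_colour])
x12 = np.array([1.0 if c == "Green" else 0.0 for c in eye_colour])

# Design matrix with columns: intercept, x11, x12, X2.
X = np.column_stack([np.ones(len(x2)), x11, x12, x2])

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b_brown, b_green, b_x2 = coef

print("intercept (Blue baseline):", b0)
print("Brown vs Blue:", b_brown)
print("Green vs Blue:", b_green)
print("slope on X2:", b_x2)
```

Because the toy y has no noise, the fit recovers the coefficients used to build it. The coefficients on x11 and x12 are read as the shift in y for Brown and Green relative to Blue, holding X2 fixed. In practice you would usually not build the columns by hand; for example, `pandas.get_dummies(..., drop_first=True)` produces the same kind of k - 1 indicator columns.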
Here's an article that explains different coding methods:
https://stats.idre.ucla.edu/spss/faq/coding-systems-for-categorical-variables-in-regression-analysis-2/