Does linear regression work with a categorical independent variable and a continuous dependent variable?

I have a dataset where:

X1 - categorical independent variable

X2 - continuous independent variable

y - continuous dependent variable

I'm looking to use X1 and X2 to predict y. Is linear regression appropriate here? Does it even make sense to regress on a categorical independent variable? If so, how can I use linear regression when X1 is a categorical independent variable (e.g. eye colour)?

Should I create a separate linear regression model for each of the categories in X1? Or try to create a multiple linear regression model?

Looking online, most resources cover either continuous independent -> continuous dependent (linear regression) or continuous independent -> categorical dependent (logistic regression).

I would appreciate being pointed to any resources or tools that could help.

You can use linear regression, but you first need to encode X1 as a series of numeric variables.

Here's a simple example using the 'dummy coding' method, with Blue as the reference category (all zeros):

┏━━━━━━━━━━━━┳━━━━━┳━━━━━┓
┃ Eye Colour ┃ x11 ┃ x12 ┃
┣━━━━━━━━━━━━╋━━━━━╋━━━━━┫
┃ Blue       ┃  0  ┃  0  ┃
┣━━━━━━━━━━━━╋━━━━━╋━━━━━┫
┃ Brown      ┃  1  ┃  0  ┃
┣━━━━━━━━━━━━╋━━━━━╋━━━━━┫
┃ Green      ┃  0  ┃  1  ┃
┗━━━━━━━━━━━━┻━━━━━┻━━━━━┛
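As a rough sketch of how the coding in the table above could be applied in practice, the snippet below dummy-codes an eye-colour column with pandas and fits a single multiple linear regression with NumPy's least-squares solver. The column names (`eye_colour`, `x2`, `y`) and the toy values are made up for illustration; the data were constructed so that y = 1 + 1*x2 + 2*Brown + 5*Green exactly, which makes the fitted coefficients easy to check.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (names and values are assumptions, not from the question).
# y was built as 1 + 1*x2 + 2*[Brown] + 5*[Green], with Blue as the baseline.
df = pd.DataFrame({
    "eye_colour": ["Blue", "Brown", "Green", "Blue", "Brown", "Green"],
    "x2": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [2.0, 5.0, 9.0, 5.0, 8.0, 12.0],
})

# Dummy-code X1. drop_first=True drops the first level in sorted order ("Blue"),
# so Blue becomes the reference category, matching the table.
dummies = pd.get_dummies(df["eye_colour"], prefix="eye", drop_first=True, dtype=float)

# Design matrix: intercept column, the two dummy columns, then the continuous X2.
X = np.column_stack([np.ones(len(df)), dummies.to_numpy(), df["x2"].to_numpy()])

# Ordinary least squares fit of y on [1, eye_Brown, eye_Green, x2].
coef, *_ = np.linalg.lstsq(X, df["y"].to_numpy(), rcond=None)
intercept, b_brown, b_green, b_x2 = coef

print(dict(zip(["intercept", *dummies.columns, "x2"], coef.round(3))))
```

In this setup the coefficients on the dummy columns are read as offsets relative to the reference category (Blue), while all categories share the same slope on x2. One model with dummy variables is therefore used here, not a separate model per category.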

Here's an article that explains different coding methods:

https://stats.idre.ucla.edu/spss/faq/coding-systems-for-categorical-variables-in-regression-analysis-2/


