
I need to remove multicollinearity between features

I have categorical variables such as Gender, Anxiety, and Alcoholic. When I convert them to numerical values with an encoder, the encoded columns end up with very similar values, so multicollinearity appears between them. How can I convert these variables to numbers so that multicollinearity does not arise? All three variables are important for predicting the target variable.

You don't need to transform the data. Instead, change the way you measure the association between the variables. Because these are categorical features, a correlation coefficient computed on the encoded numbers can be misleading; use the chi-squared test of independence instead. Assessed that way, you should not run into this issue.
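As a rough illustration of the chi-squared approach described above, here is a minimal sketch using `pandas.crosstab` and `scipy.stats.chi2_contingency`. The column names come from the question, but the data values and the helper name `chi2_independence` are invented for this example and are not from the original post.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical toy data: column names follow the question, values are made up.
df = pd.DataFrame({
    "Gender":    ["M", "F", "M", "F", "M", "F", "M", "F"],
    "Anxiety":   ["yes", "no", "yes", "no", "no", "yes", "yes", "no"],
    "Alcoholic": ["yes", "no", "yes", "no", "yes", "no", "yes", "no"],
})


def chi2_independence(data, col_a, col_b):
    """Return (chi2 statistic, p-value, degrees of freedom) for two categorical columns."""
    # Contingency table of observed counts for each pair of categories.
    table = pd.crosstab(data[col_a], data[col_b])
    # For 2x2 tables SciPy applies Yates' continuity correction by default.
    chi2, p_value, dof, _expected = chi2_contingency(table)
    return chi2, p_value, dof


pairs = [("Gender", "Anxiety"), ("Gender", "Alcoholic"), ("Anxiety", "Alcoholic")]
results = {pair: chi2_independence(df, *pair) for pair in pairs}

for (a, b), (chi2, p, dof) in results.items():
    print(f"{a} vs {b}: chi2={chi2:.3f}, p={p:.3f}, dof={dof}")
```

A small p-value suggests the two features are associated rather than independent. The toy sample here is far too small for reliable inference (the chi-squared approximation is usually considered shaky when expected cell counts fall below about 5), so treat it only as a demonstration of the mechanics.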

