I want to convert the categorical variable to numerical in Python
I have a dataframe with categorical variables. I want to convert them to numerical values using the following logic:
I have two lists: the first contains the distinct categorical values in the column, and the second contains the value for each category. Now I need to map these values in place of the categorical values.
For example:
List_A = ['A','B','C','D','E']
List_B = [3,2,1,1,2]
I need to replace A with 3, B with 2, C and D with 1, and E with 2.
Is there any way to do this in Python?
I can do this with multiple for loops, but I am looking for an easier way or a direct function, if there is one.
Any help is very much appreciated. Thanks in advance.
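Create a mapping dict:
List_A = ['A', 'B', 'C', 'D', 'E']
List_B = [3, 2, 1, 1, 2]
# pair each category with its value
d = dict(zip(List_A, List_B))
new_list = ['A', 'B', 'C', 'D', 'E', 'A', 'B']
# values without a mapping are skipped
new_mapped_list = [d[v] for v in new_list if v in d]
print(new_mapped_list)  # [3, 2, 1, 1, 2, 3, 2]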
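Note that the `if v in d` filter above drops any value that has no mapping, so the result can be shorter than the input. If the positions need to stay aligned (for example, to write the result back into a column), one option is `dict.get` with a fallback. This is only a minimal sketch; the extra `'Z'` value and the `None` fallback are assumptions made for illustration:

```python
List_A = ['A', 'B', 'C', 'D', 'E']
List_B = [3, 2, 1, 1, 2]
d = dict(zip(List_A, List_B))

# 'Z' is a hypothetical value with no entry in the mapping
new_list = ['A', 'Z', 'C']

# dict.get returns the fallback (None here) instead of raising KeyError,
# so the output keeps the same length and order as the input
aligned = [d.get(v) for v in new_list]
print(aligned)  # [3, None, 1]
```

Whether to skip unknown values or keep a placeholder depends on what the mapped list is used for afterwards.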
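Or define a function and use map:
List_A = ['A', 'B', 'C', 'D', 'E']
List_B = [3, 2, 1, 1, 2]
d = dict(zip(List_A, List_B))
def mapper(value):
    # return the mapped number, or None if the category is unknown
    if value in d:
        return d[value]
    return None
new_list = ['A', 'B', 'C', 'D', 'E', 'A', 'B']
# in Python 3, map returns an iterator, so wrap it in list() to see the values
list(map(mapper, new_list))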
Suppose df is your data frame and "Category" is the name of the column holding your categories:
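# assign only to the Category column of the matching rows
# (assumes Category is an object-dtype column)
df.loc[df.Category == "A", "Category"] = 3
df.loc[(df.Category == "B") | (df.Category == "E"), "Category"] = 2
df.loc[(df.Category == "C") | (df.Category == "D"), "Category"] = 1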
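The per-category assignments above can also be collapsed into a single dictionary lookup with `Series.map`. The following is a sketch under assumptions: the example frame and the new column name `Category_num` are made up for illustration, and only the column name "Category" comes from the answer.

```python
import pandas as pd

List_A = ['A', 'B', 'C', 'D', 'E']
List_B = [3, 2, 1, 1, 2]
d = dict(zip(List_A, List_B))

# Hypothetical example frame; "Category_num" is an illustrative column name
df = pd.DataFrame({"Category": ['A', 'B', 'C', 'D', 'E', 'A', 'B']})

# Series.map looks each value up in the dict;
# values without a key would become NaN
df["Category_num"] = df["Category"].map(d)
print(df["Category_num"].tolist())  # [3, 2, 1, 1, 2, 3, 2]
```

Writing to a new column keeps the original categories available; assigning back to `df["Category"]` would replace them in place.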
If you only need to replace the values in one list with the values of the other, and the structure is as you describe (two lists, same length, matching positions), then you only need this:
list_a = []
list_a = list_b
A more convoluted solution would be a function that creates a dictionary you can reuse on other lists:
# build a lookup dictionary from two parallel lists
def convert_list(ls_a, ls_b):
    dic_new = {}
    for letter, number in zip(ls_a, ls_b):
        dic_new[letter] = number
    return dic_new
This builds a dictionary with the combinations you need. You pass in the two lists, then you can use that dictionary on another list:
List_A = ['A', 'B', 'C', 'D', 'E']
List_B = [3, 2, 1, 1, 2]
dic_new = convert_list(List_A, List_B)
other_list = ['a', 'b', 'c', 'd']
for item in other_list:
    # the keys are upper case, so normalise before the lookup
    print(dic_new[item.upper()])
# prints
3
2
1
1
Cheers
You could use a solution from the scikit-learn machine learning module:
OneHotEncoder
LabelEncoder
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html
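One caveat with the encoder route: `LabelEncoder` assigns consecutive integers to the sorted distinct labels, so it would not reproduce a hand-chosen mapping such as A to 3 or C and D both to 1. A minimal sketch, assuming scikit-learn is installed; the `values` list is made up for illustration:

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical input column
values = ['A', 'B', 'C', 'D', 'E', 'A', 'B']

le = LabelEncoder()
# fit_transform learns the sorted distinct labels and returns their integer codes
codes = le.fit_transform(values)

print(list(le.classes_))  # the learned labels, in sorted order
print(codes.tolist())     # [0, 1, 2, 3, 4, 0, 1]
```

If the numbers themselves carry meaning, as in the question, a dictionary-based mapping is likely the better fit; the encoders are mainly useful when any distinct numeric code will do.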
The pandas "hard" way:
https://stackoverflow.com/a/29330853/9799449