Converting and mapping categorical data
I have a dataset with categorical and numeric columns. I want to convert the categorical data to numeric, mapping each category to a specific numeric value.
For example, under the column ['Education'], I have Highschool, Undergraduate, Graduate, PHD, etc. I'd appreciate it if someone could provide code to map each category to an arbitrary numeric value.
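# One-hot encoding
import pandas as pd
df = pd.DataFrame(["Highschool", "Undergraduate", "Highschool", "Graduate", "PHD", "Graduate", "Graduate", "Undergraduate"], columns=["Education"])
# dtype=int gives 0/1 columns; recent pandas versions return True/False by default
df_transformed = pd.get_dummies(df, dtype=int)
df_transformed.head()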
Output:
   Education_Graduate  Education_Highschool  Education_PHD  Education_Undergraduate
0                   0                     1              0                        0
1                   0                     0              0                        1
2                   0                     1              0                        0
3                   1                     0              0                        0
4                   0                     0              1                        0
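One-hot encoding produces one column per category. If a single integer column with a chosen order is preferred instead, one option is an ordered pandas Categorical. The following is only a sketch: the ranking in `order` and the column name `Education_code` are assumptions made for illustration, not something stated in the question.

```python
import pandas as pd

df = pd.DataFrame(
    ["Highschool", "Undergraduate", "Highschool", "Graduate",
     "PHD", "Graduate", "Graduate", "Undergraduate"],
    columns=["Education"],
)

# Assumed ranking of the categories, lowest to highest (hypothetical)
order = ["Highschool", "Undergraduate", "Graduate", "PHD"]

# The code of each value is its position in `order`
cat = pd.Categorical(df["Education"], categories=order, ordered=True)
df["Education_code"] = cat.codes

print(df)
```

Any value that is not listed in `order` gets the code -1, so it is worth checking for that before using the column.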
# Label encoding
from sklearn import preprocessing
encoder = preprocessing.LabelEncoder()
encoder.fit(df["Education"].values)
# Pass any list of known categories here and each one gets its integer code; this is a sample list.
# LabelEncoder numbers the classes in sorted order: Graduate=0, Highschool=1, PHD=2, Undergraduate=3
encoder.transform(["Undergraduate", "Highschool", "Graduate"])  # -> array([3, 1, 0])
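LabelEncoder picks the integers itself, based on the sorted order of the labels. Since the question asks for a specific value per category, an explicit dictionary with `Series.map` may be closer to what is wanted. This is a minimal sketch: the numbers in `education_map` and the column name `Education_code` are arbitrary and hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    ["Highschool", "Undergraduate", "Highschool", "Graduate",
     "PHD", "Graduate", "Graduate", "Undergraduate"],
    columns=["Education"],
)

# Hypothetical mapping: replace the numbers with whatever values you need
education_map = {"Highschool": 1, "Undergraduate": 2, "Graduate": 3, "PHD": 4}

# Series.map looks each value up in the dict
df["Education_code"] = df["Education"].map(education_map)

# Categories missing from the dict come back as NaN, so count them
unmapped = int(df["Education_code"].isna().sum())

print(df)
print("unmapped rows:", unmapped)
```

With a dictionary the mapping is visible in one place and easy to change, at the cost of having to list every category by hand.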