
Categorical variables with pandas

When loading a CSV file that looks like this:

0 1 male 3 4 5 6
1 0 female 6 7 8 9
.....

Is it possible to automatically convert the third column to integers, for example 0 for male and 1 for female?

read_csv accepts an argument named converters. It can be used to apply functions to particular columns as the file is read in. converters should be passed as a dictionary of the following form:

{column_index: function_to_apply}
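As a minimal sketch of that dictionary form, the snippet below applies a function to one column while reading. The in-memory CSV text, the comma delimiter, header=None, and the helper name to_upper are assumptions made for illustration; they are not part of the original question.

```python
import io

import pandas as pd

# Hypothetical sample data: comma-separated, no header row (an assumption).
csv_text = "0,1,male,3,4,5,6\n1,0,female,6,7,8,9\n"


def to_upper(value: str) -> str:
    """Converter functions receive each cell of the column as a string."""
    return value.upper()


# Key 2 selects the third column (zero-based position).
df = pd.read_csv(io.StringIO(csv_text), header=None, converters={2: to_upper})
print(df)
```

Any callable that takes a single string works as the dictionary value, so a named function, a lambda, or a bound method can all be used here.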

You could use this to apply a function to the third column. All you need to do is set the function to look up each value in a dictionary d, which maps "male" to 0 and "female" to 1:

>>> import pandas as pd
>>> d = {"male": 0, "female": 1}
>>> pd.read_csv("file.csv", converters={2: d.get})
...
0 1 0 3 4 5 6
1 0 1 6 7 8 9
...
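For a version that can be run without a file on disk, here is a hedged sketch of the same idea. It assumes the data is comma-separated and has no header row (hence header=None), and reads it from an in-memory string; adjust both to match the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for file.csv: comma-separated, no header row.
csv_text = "0,1,male,3,4,5,6\n1,0,female,6,7,8,9\n"

# Mapping from the text labels to integer codes.
d = {"male": 0, "female": 1}

# d.get is called on every cell of the third column as the file is parsed.
df = pd.read_csv(io.StringIO(csv_text), header=None, converters={2: d.get})
print(df)
```

Note that d.get returns None for any label missing from d, so unexpected values in the column end up as missing entries rather than raising an error. Using d.__getitem__ instead would raise a KeyError on unknown labels, which may be preferable if silent gaps are a concern.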
