Creating a fixed-size DataFrame of dummy variables for numeric values

Question

I need to create dummy variables for a column that can take 16 possible values (0-15), but the column does not necessarily contain all 16 values at the time I create the dummy variables from it:

   my_column
0          3
1          4
2          7
3          1
4          9

I expect the dummy variables to span 16 columns (or any other number that I fix in advance), with the number in each column name corresponding to a value of my_column. But if my_column contains only, say, 5 of the 16 possible values, pd.get_dummies creates only 5 columns (which is the expected behavior of this method), as follows:

   my_column  1  3  4  7  9
0          3  0  1  0  0  0
1          4  0  0  1  0  0
2          7  0  0  0  1  0
3          1  1  0  0  0  0
4          9  0  0  0  0  1

How can I get a result like the following?

   my_column  0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15
0          3  0  0  0  1  0  0  0  0  0  0   0   0   0   0   0   0
1          4  0  0  0  0  1  0  0  0  0  0   0   0   0   0   0   0
2          7  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0
3          1  0  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0
4          9  0  0  0  0  0  0  0  0  0  1   0   0   0   0   0   0

Answer 1

Use get_dummies, then reindex on the columns:

v = pd.get_dummies(df.my_column).reindex(columns=range(0, 16), fill_value=0)

According to the docs, reindex will:

Conform DataFrame to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.

Passing fill_value=0 fills all the missing columns with zeros instead of NaN.
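To make the effect of fill_value concrete, here is a minimal self-contained sketch that rebuilds the question's sample frame and compares reindex with and without it. The dtype=int argument is an assumption added for illustration: recent pandas versions return boolean dummies by default, so it is used here only to reproduce the 0/1 output shown in the question.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"my_column": [3, 4, 7, 1, 9]})

# dtype=int is an assumption for illustration (newer pandas defaults to bool).
dummies = pd.get_dummies(df["my_column"], dtype=int)

# Without fill_value, columns absent from the data are filled with NaN.
with_nan = dummies.reindex(columns=range(16))

# With fill_value=0, the same missing columns are filled with zeros.
v = dummies.reindex(columns=range(16), fill_value=0)

print(v.shape)
```

Every row of v still contains exactly one 1, because reindex only adds the missing columns and leaves the existing ones untouched.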

You can add the original column to the result with either insert or concat:

v.insert(0, 'my_column', df.my_column)

v = pd.concat([df, v], axis=1)   # alternative to insert; use one or the other, not both

v

   my_column  0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15
0          3  0  0  0  1  0  0  0  0  0  0   0   0   0   0   0   0
1          4  0  0  0  0  1  0  0  0  0  0   0   0   0   0   0   0
2          7  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0
3          1  0  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0
4          9  0  0  0  0  0  0  0  0  0  1   0   0   0   0   0   0
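Since the question asks for a column count fixed in advance, the approach above can be wrapped in a small function. This is only a sketch: fixed_dummies and its parameters are hypothetical names introduced here, not part of pandas, and dtype=int is again an assumption to get 0/1 output on recent pandas versions.

```python
import pandas as pd


def fixed_dummies(df, column, n_values):
    """Hypothetical helper: one dummy column per value in range(n_values)."""
    dummies = pd.get_dummies(df[column], dtype=int)
    # Add the missing value columns (filled with 0) and order them 0..n_values-1.
    dummies = dummies.reindex(columns=range(n_values), fill_value=0)
    # Keep the original column next to its dummies.
    return pd.concat([df[[column]], dummies], axis=1)


df = pd.DataFrame({"my_column": [3, 4, 7, 1, 9]})
result = fixed_dummies(df, "my_column", 16)
print(result.shape)
```

Note that reindex silently drops any dummy column whose label falls outside range(n_values), so a value of 16 or more in the data would end up as an all-zero row in this sketch.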

Answer 1 (3, accepted, 2017-12-18 12:47:49)
