Using pandas.get_dummies

Question

I have a data frame with a number of columns. Some of them I want to keep as they are (their names are stored in to_keep), and for the others I want to create categorical (dummy) variables using pandas.get_dummies (their names are stored in to_change).

However, I can't seem to get the syntax right, and the examples I have seen (e.g. here: http://blog.yhat.com/posts/logistic-regression-and-python.html ) don't seem to help.

Here's what I have at present:

new_df = df.copy()
dummies = pd.get_dummies(new_df[to_change])
new_df = new_df[to_keep].join(dummies)
return new_df

Any help on where I am going wrong would be appreciated. The problem I keep running into is that this only adds categorical variables for the first column in to_change.

Answer 1

I must say I didn't understand the problem completely.

However, say your DataFrame is df, and you have a list of columns, to_make_categorical, that should be expanded into dummy variables.

The DataFrame with the non-categorical columns is

wo_categoricals = df[[c for c in list(df.columns) if c not in to_make_categorical]]

The DataFrames of the categorical expansions, one per column, are

categoricals = [pd.get_dummies(df[c], prefix=c) for c in to_make_categorical]

Now you can just concatenate them horizontally:

pd.concat([wo_categoricals] + categoricals, axis=1)
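
Putting the pieces above together, a minimal end-to-end sketch might look like the following. The sample data and its column names (age, color, rank) are made up for illustration and do not come from the question. The last line shows an alternative that relies on the columns argument of pd.get_dummies. One possible explanation for the behaviour in the question, which the question itself does not confirm, is that pd.get_dummies called on a DataFrame without columns only encodes object, string, or category columns, so numeric columns in to_change would pass through unchanged.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25, 32, 47],
    "color": ["red", "blue", "red"],
    "rank": [1, 2, 1],  # numeric codes that should still be treated as categories
})
to_make_categorical = ["color", "rank"]

# Columns that are kept unchanged.
wo_categoricals = df[[c for c in df.columns if c not in to_make_categorical]]

# One dummy-variable DataFrame per categorical column, prefixed with its name.
categoricals = [pd.get_dummies(df[c], prefix=c) for c in to_make_categorical]

# Join everything side by side on the shared index.
result = pd.concat([wo_categoricals] + categoricals, axis=1)

# Alternative: name the columns to encode explicitly, so numeric columns
# such as "rank" are expanded as well.
result_alt = pd.get_dummies(df, columns=to_make_categorical)
```

Both result and result_alt should end up with the untouched age column followed by one indicator column per distinct value of color and rank.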
