Creating new columns from another pandas column
I would like to create new columns from the genres column. The genres column contains one or more genres per row, and I would like one column for each genre name. Each of those columns should then hold 1 or 0, depending on whether the row has that genre.
The dataframe should look like the one in the image below.
I have no idea how to approach this.
Applying a one-hot encoder or pandas' get_dummies function directly didn't work; I got something like this:
This is not what I need.
It looks like the values in the Genre column were one-hot encoded. One-hot encoding is also known as creating dummy variables.
Pandas has a function, pd.get_dummies(), that should let you one-hot encode the Genre column. Pass in your data frame and use the columns parameter to select the Genre column.
See the function documentation and other options here: https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html
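As a rough illustration of the columns parameter mentioned above, here is a minimal sketch. The sample frame, its title column, and the genre values are made up for the example, since the original data is not shown; dtype=int is only there so the output uses 1 and 0 rather than booleans.

```python
import pandas as pd

# Hypothetical sample data; the real frame from the question is not shown.
df = pd.DataFrame({
    "title": ["A", "B", "C"],
    "Genre": ["Drama", "Comedy", "Drama"],
})

# Encode only the Genre column: one indicator column per distinct value.
# Columns not listed in `columns` (here, title) are passed through unchanged.
encoded = pd.get_dummies(df, columns=["Genre"], dtype=int)
print(encoded)
```

Note that this treats each cell as a single category, so a cell holding several genres at once would become one combined category rather than several flags.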
You can use CategoricalDtype as below:
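import pandas as pd
from pandas.api.types import CategoricalDtype

df = pd.DataFrame({'country': ['Brazil', 'Australia', 'Canada', 'Brazil', 'Germany']})

# Declare the categories explicitly so that each one gets its own dummy column
country_type = CategoricalDtype(categories=['Australia', 'Brazil', 'Canada', 'Germany'])
df['country'] = df['country'].astype(country_type)

print(pd.get_dummies(df, prefix=['country']))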
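The question describes cells that hold one or several genres. If those are stored as a delimiter-separated string, Series.str.get_dummies may be closer to what is wanted, because it splits each cell before encoding. The sketch below assumes a "|" separator and invented sample rows; neither is confirmed by the question, so adjust sep to match the real data.

```python
import pandas as pd

# Hypothetical sample data; the "|" separator is an assumption.
df = pd.DataFrame({
    "title": ["A", "B", "C"],
    "genres": ["Action|Comedy", "Drama", "Comedy|Drama"],
})

# Split each cell on the separator and build one 0/1 column per genre.
genre_flags = df["genres"].str.get_dummies(sep="|")

# Attach the indicator columns to the original frame (indexes line up).
result = df.join(genre_flags)
print(result)
```

If the cells hold Python lists instead of strings, this call would not apply directly; the values would need to be joined into strings or expanded some other way first.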