Python pandas: find a specific string in a column and fill the columns that match the string

Question

I have a dataframe with several columns. One of them holds the movie "genres" separated by |. I have split this column into several others, giving X columns that each hold one of the split values. However, what I need is one column per "genre", filled with 1 or 0 depending on whether the column header is found in either the original genres column or one of the split columns. My dataframe is set up like this:

    import pandas as pd

    df = pd.DataFrame({'A': ['drama|Action', 'Drama', 'Action'], 'A_split1': ['Drama', 'Drama', 'Action'],'A_split2': ['Action', 'None', 'None'],'Drama': [0, 0, 0], 'Action': [0, 0, 0], 'Western': [0, 0, 0]},
                      index = ['a1', 'a2', 'a3'])
    df

But I could not find how to check whether the header name appears in a string, so that I can set the 1 or 0.

Answer 1

I think you need pop to extract the column, str.get_dummies to build the indicator columns, and join to add them to the original dataframe:

    import pandas as pd

    df = pd.DataFrame({'A': ['Drama|Action', 'Drama', 'Action'], 'B':range(3)},
                      index = ['a1', 'a2', 'a3'])
    print(df)
                   A  B
    a1  Drama|Action  0
    a2         Drama  1
    a3        Action  2

    # pop removes column A from df; get_dummies splits on '|' by default
    df = df.join(df.pop('A').str.get_dummies())
    print(df)
        B  Action  Drama
    a1  0       1      1
    a2  1       0      1
    a3  2       1      0

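If you want to keep the original column, select it instead of using pop (run this on the original df, since pop has already removed column A above):

    df = df.join(df['A'].str.get_dummies())
    print(df)
                   A  B  Action  Drama
    a1  Drama|Action  0       1      1
    a2         Drama  1       0      1
    a3        Action  2       1      0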
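
The question's data also has a lowercase 'drama' and a Western column that no row matches, which plain get_dummies would not handle. A minimal sketch of one way to deal with both, assuming the full genre list is known up front (the genres list and the use of str.title to normalise case are my own assumptions, not part of the answer above):

```python
import pandas as pd

# Sample data from the question, including the lowercase 'drama'.
df = pd.DataFrame({'A': ['drama|Action', 'Drama', 'Action']},
                  index=['a1', 'a2', 'a3'])

# Hypothetical list of every genre that should get its own column.
genres = ['Drama', 'Action', 'Western']

# Normalise case, split on '|' into indicator columns, then reindex so
# that every expected genre has a column (genres never seen become 0).
dummies = (df['A'].str.title()
                  .str.get_dummies(sep='|')
                  .reindex(columns=genres, fill_value=0))

result = df.join(dummies)
print(result)
```

Note that str.title only suits genre names written in title case; for other spellings, a different normalisation step would probably be needed.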
