
How to get multiple columns from a single column?

I have a column like this:

         Genre
Action|Crime|Drama|Thriller                 
Action|Crime|Thriller                          
Drama|Thriller                                 
Crime|Drama                                    
Horror|Thriller                                
Crime|Drama|Mystery|Thriller                   
Documentary                                    
Comedy|Crime                                   
Action|Adventure|Sci-Fi  
... and so on.

What I want is one column per genre, with 1 where the row has that genre and 0 where it does not, e.g.:
action  scifi crime adventure . . . . .
0       1      0     1     0  
1       0      0     0     0

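Use .str.split, stack, and .str.get_dummies, then sum per original row. The sum(level=0) form was removed in pandas 2.0, so group on the index level instead:

df['Genre'].str.split('|', expand=True).stack().str.get_dummies().groupby(level=0).sum()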

Output:

   Action  Adventure  Comedy  Crime  Documentary  Drama  Horror  Mystery  \
0       1          0       0      1            0      1       0        0   
1       1          0       0      1            0      0       0        0   
2       0          0       0      0            0      1       0        0   
3       0          0       0      1            0      1       0        0   
4       0          0       0      0            0      0       1        0   
5       0          0       0      1            0      1       0        1   
6       0          0       0      0            1      0       0        0   
7       0          0       1      1            0      0       0        0   
8       1          1       0      0            0      0       0        0   

   Sci-Fi  Thriller  
0       0         1  
1       0         1  
2       0         1  
3       0         0  
4       0         1  
5       0         1  
6       0         0  
7       0         0  
8       1         0  
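
For reference, here is a self-contained sketch of the approach above, using the nine sample rows from the question as the data. It also shows Series.str.get_dummies(sep='|'), a pandas shortcut that splits and encodes in one call; this shortcut is not part of the original answer, and the variable names stacked and direct are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Genre": [
    "Action|Crime|Drama|Thriller",
    "Action|Crime|Thriller",
    "Drama|Thriller",
    "Crime|Drama",
    "Horror|Thriller",
    "Crime|Drama|Mystery|Thriller",
    "Documentary",
    "Comedy|Crime",
    "Action|Adventure|Sci-Fi",
]})

# Split into one genre per cell, stack into a long Series,
# one-hot encode, then collapse back to one row per original index.
stacked = (
    df["Genre"].str.split("|", expand=True)
    .stack()
    .str.get_dummies()
    .groupby(level=0).sum()
)

# Shortcut: let get_dummies do the splitting itself.
direct = df["Genre"].str.get_dummies(sep="|")

print(direct)
```

Both frames should hold the same indicator columns, sorted alphabetically by genre name. To keep the original column alongside the indicators, something like df.join(direct) should work.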

First take the Genre column and read its first value with .values[0].
Then split that string on | to get a list of genre names.
Indexing the DataFrame with that list, df[genres], selects those columns, provided the DataFrame already has a column named after each genre.

In short (for a single entry):

genres = list(df['Genre'].values[0].split('|'))
df[genres]
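
As a rough illustration of when this selection works: the sketch below assumes the indicator columns have already been built (here with Series.str.get_dummies, which this answer does not mention), since indexing with a list of names raises a KeyError if those columns do not exist. The two-row frame and the names dummies and subset are hypothetical.

```python
import pandas as pd

# Small hypothetical frame in the same shape as the question's data.
df = pd.DataFrame({"Genre": ["Action|Crime|Drama|Thriller", "Comedy|Crime"]})

# Assumption: one indicator column per genre already exists.
dummies = df["Genre"].str.get_dummies(sep="|")

# Genre names of the first entry, as in the answer above.
genres = df["Genre"].values[0].split("|")

# Select only the columns for those genres.
subset = dummies[genres]
print(subset)
```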
