
Python 2 equivalent to get_dummies with a pandas DataFrame

I am trying to understand why my code cannot select specific values as if they were dummy columns, using the following example data:

df

            shop   category  subcategory     season
date                
2013-09-04  abc    weddings  shoes           winter
2013-09-04  def    jewelry   watches         summer
2013-09-05  ghi    sports    sneakers        spring
2013-09-05  jkl    jewelry   necklaces       fall

Here is my basic code:

wedding_df = df[["weddings","winter","summer","spring","fall"]]

I'm using Python 2 in my notebook, so this may well be a version issue, or it may require get_dummies(); either way, some guidance would be helpful. The idea is to create a dummy DataFrame that uses binary values to indicate whether a row had the weddings category and which season it fell in.

Here is an example of what I'm looking to achieve:

        weddings    winter  summer  spring  fall
71654   1.0         0.0     1.0     0.0     0.0
72168   1.0         0.0     1.0     0.0     0.0
72080   1.0         0.0     1.0     0.0     0.0

with corr() :

         weddings   fall     spring    summer      winter
weddings NaN        NaN      NaN        NaN        NaN
fall     NaN       1.000000  0.054019   -0.331866   -0.012122
spring   NaN       0.054019  1.000000   -0.857205   0.072420
summer   NaN       -0.331866 -0.857205  1.000000    -0.484578
winter   NaN       -0.012122 0.072420   -0.484578   1.000000

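The selection fails because "weddings", "winter", and so on are values inside the category and season columns, not column names. You can call get_dummies() with prefix and prefix_sep both set to an empty string, so the new dummy columns are named after the values themselves. After that, df[["weddings","winter","summer","spring","fall"]] works: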

df = pd.get_dummies(df,prefix = '', prefix_sep = '' )
df
            abc  def  ghi  jkl  jewelry  sports  weddings  necklaces  shoes  \
date                                                                          
2013-09-04    1    0    0    0        0       0         1          0      1   
2013-09-04    0    1    0    0        1       0         0          0      0   
2013-09-05    0    0    1    0        0       1         0          0      0   
2013-09-05    0    0    0    1        1       0         0          1      0   
            sneakers  watches  fall  spring  summer  winter  
date                                                         
2013-09-04         0        0     0       0       0       1  
2013-09-04         0        1     0       0       1       0  
2013-09-05         1        0     0       1       0       0  
2013-09-05         0        0     1       0       0       0  
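
For reference, here is a self-contained sketch of the same approach that rebuilds the example data and encodes only the two columns the question needs. The names `dummies` and `wedding_df` and the `astype(int)` step are my own additions for illustration: depending on the pandas version, get_dummies() may return boolean or integer columns, so the cast just normalizes the result to 0/1. Note that get_dummies() is a pandas function, so its availability depends on the installed pandas version rather than on Python 2 versus Python 3.

```python
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame(
    {
        "shop": ["abc", "def", "ghi", "jkl"],
        "category": ["weddings", "jewelry", "sports", "jewelry"],
        "subcategory": ["shoes", "watches", "sneakers", "necklaces"],
        "season": ["winter", "summer", "spring", "fall"],
    },
    index=pd.to_datetime(
        ["2013-09-04", "2013-09-04", "2013-09-05", "2013-09-05"]
    ),
)
df.index.name = "date"

# Encode only category and season; empty prefix and separator make the
# dummy column names equal to the original values.
dummies = pd.get_dummies(df[["category", "season"]], prefix="", prefix_sep="")

# The original selection now works because these are real column names.
# astype(int) normalizes bool/uint8 output to plain 0/1 integers.
wedding_df = dummies[["weddings", "winter", "summer", "spring", "fall"]].astype(int)
print(wedding_df)
```

Encoding only `category` and `season` avoids creating columns for every shop and subcategory, and it reduces the chance of two different source columns producing the same dummy column name once the prefix is blank.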

Update

pd.get_dummies(df.loc[df['category']=='weddings',['category','season']],prefix = '', prefix_sep = '' )
Out[820]: 
            weddings  winter
date                        
2013-09-04         1       1
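
Filtering to the weddings rows first means get_dummies() only creates columns for the seasons that actually occur in those rows (just winter in this sample). If the full set of season columns from the question's desired output is needed, one possible sketch is to reindex the result against a fixed column list. The `seasons` list and the `reindex` and `astype(float)` steps are assumptions added for illustration, not part of the original answer.

```python
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame(
    {
        "shop": ["abc", "def", "ghi", "jkl"],
        "category": ["weddings", "jewelry", "sports", "jewelry"],
        "subcategory": ["shoes", "watches", "sneakers", "necklaces"],
        "season": ["winter", "summer", "spring", "fall"],
    },
    index=pd.to_datetime(
        ["2013-09-04", "2013-09-04", "2013-09-05", "2013-09-05"]
    ),
)
df.index.name = "date"

# Keep only the weddings rows and the two columns of interest.
subset = df.loc[df["category"] == "weddings", ["category", "season"]]

# Hypothetical fixed list of seasons so that absent ones still get a column.
seasons = ["winter", "summer", "spring", "fall"]

wedding_df = (
    pd.get_dummies(subset, prefix="", prefix_sep="")
    # Add any missing season columns, filled with 0.
    .reindex(columns=["weddings"] + seasons, fill_value=0)
    # Match the 1.0 / 0.0 style of the desired output in the question.
    .astype(float)
)
print(wedding_df)
```

After this filter the weddings column is 1.0 in every row. A constant column has no defined correlation with anything, which is consistent with the NaN row and column for weddings in the corr() output shown in the question.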
