I am trying to understand why my code cannot select specific column values as dummy variables, using the following example data:
df
shop category subcategory season
date
2013-09-04 abc weddings shoes winter
2013-09-04 def jewelry watches summer
2013-09-05 ghi sports sneakers spring
2013-09-05 jkl jewelry necklaces fall
Here is my basic code:
wedding_df = df[["weddings","winter","summer","spring","fall"]]
I'm using Python 2 in my notebook, so it may well be a version issue, or it may require get_dummies(), but some guidance would be helpful. The idea is to create a dummy dataframe that uses binary values to indicate whether a row has the weddings category and which season it falls in.
Here is an example of what I'm looking to achieve:
weddings winter summer spring fall
71654 1.0 0.0 1.0 0.0 0.0
72168 1.0 0.0 1.0 0.0 0.0
72080 1.0 0.0 1.0 0.0 0.0
with corr():
weddings fall spring summer winter
weddings NaN NaN NaN NaN NaN
fall NaN 1.000000 0.054019 -0.331866 -0.012122
spring NaN 0.054019 1.000000 -0.857205 0.072420
summer NaN -0.331866 -0.857205 1.000000 -0.484578
winter NaN -0.012122 0.072420 -0.484578 1.000000
You can try get_dummies with prefix and prefix_sep both set to an empty string. The dummy columns are then named after the values themselves, so you are able to use df[["weddings","winter","summer","spring","fall"]] afterwards:
df = pd.get_dummies(df,prefix = '', prefix_sep = '' )
df
abc def ghi jkl jewelry sports weddings necklaces shoes \
date
2013-09-04 1 0 0 0 0 0 1 0 1
2013-09-04 0 1 0 0 1 0 0 0 0
2013-09-05 0 0 1 0 0 1 0 0 0
2013-09-05 0 0 0 1 1 0 0 1 0
sneakers watches fall spring summer winter
date
2013-09-04 0 0 0 0 0 1
2013-09-04 0 1 0 0 1 0
2013-09-05 1 0 0 1 0 0
2013-09-05 0 0 1 0 0 0
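As a self-contained sketch of the step above, the snippet below rebuilds the example data and then selects the five columns from the question. The DataFrame construction is an assumption based on the sample shown, and the `.astype(int)` call is added here only because newer pandas versions return boolean dummies by default, while the output above shows 0/1 integers.

```python
import pandas as pd

# Rebuild the sample data from the question (assumed layout).
df = pd.DataFrame(
    {
        "shop": ["abc", "def", "ghi", "jkl"],
        "category": ["weddings", "jewelry", "sports", "jewelry"],
        "subcategory": ["shoes", "watches", "sneakers", "necklaces"],
        "season": ["winter", "summer", "spring", "fall"],
    },
    index=pd.to_datetime(
        ["2013-09-04", "2013-09-04", "2013-09-05", "2013-09-05"]
    ),
)
df.index.name = "date"

# Empty prefix and separator: each dummy column is named after its value.
# astype(int) turns boolean dummies (newer pandas) into 0/1.
dummies = pd.get_dummies(df, prefix="", prefix_sep="").astype(int)

# The selection from the question now works, because the columns exist.
wedding_df = dummies[["weddings", "winter", "summer", "spring", "fall"]]
print(wedding_df)
```

The original df[["weddings", ...]] lookup fails because "weddings" and the season names are values inside the category and season columns, not column labels, until get_dummies turns them into columns.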
Update: to keep only the weddings rows, filter on category before creating the dummies:
pd.get_dummies(df.loc[df['category']=='weddings',['category','season']],prefix = '', prefix_sep = '' )
Out[820]:
weddings winter
date
2013-09-04 1 1
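One thing to keep in mind if you go on to call corr(): after filtering to the weddings rows, the weddings column is the same in every row, and a constant column has zero variance, so its correlations come out as NaN. That is probably why the weddings row and column in the question are all NaN. A possible alternative, sketched below under the assumption that you want to keep every row, is to build a 0/1 weddings indicator next to the season dummies. The names season_dummies and result are only illustrative.

```python
import pandas as pd

# Only the two columns that matter here, taken from the sample data.
df = pd.DataFrame(
    {
        "category": ["weddings", "jewelry", "sports", "jewelry"],
        "season": ["winter", "summer", "spring", "fall"],
    },
    index=pd.to_datetime(
        ["2013-09-04", "2013-09-04", "2013-09-05", "2013-09-05"]
    ),
)
df.index.name = "date"

# Dummies for a single Series are named after the values, no prefix needed.
season_dummies = pd.get_dummies(df["season"]).astype(int)

# 1 where the row is a wedding, 0 otherwise, kept for all rows.
result = season_dummies.copy()
result.insert(0, "weddings", (df["category"] == "weddings").astype(int))

# weddings now varies across rows, so its correlations are defined.
corr = result.corr()
print(result)
print(corr)
```

With only four sample rows the correlation values themselves are not meaningful; the point is just that the weddings column is no longer constant.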