This should be simple, but I'm new to working in Python. Any suggestions, please?
import pandas as pd

# original dataframe
df = pd.DataFrame({'year': [1, 1, 1, 1, 1],
                   'month': [4, 4, 4, 4, 4],
                   'mode': ['a', 'b', 'a', 'a', 'b']},
                  columns=['year', 'month', 'mode'])
#pivot/groupby etc
# df2=df.pivot(columns=('year','month'), values=('mode')).count()
# create this dataframe
df2 = pd.DataFrame({'year': [1],
                    'month': [4],
                    'a': [3],
                    'b': [2]},
                   columns=['year', 'month', 'a', 'b'])
I'm working in a Koalas (Apache Spark) environment, so the solution should work there.
df.pivot_table(index=['year','month'], aggfunc='size', columns='mode')
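To get from that pivot to the exact layout of df2 in the question, something like the following might work. It is a minimal sketch in plain pandas, not tested on Koalas; the fill_value=0 argument and the reset_index / rename_axis steps are additions for illustration, not part of the one-liner above.

```python
import pandas as pd

df = pd.DataFrame({'year': [1, 1, 1, 1, 1],
                   'month': [4, 4, 4, 4, 4],
                   'mode': ['a', 'b', 'a', 'a', 'b']})

# Count rows per (year, month) group for each mode.
# fill_value=0 (an addition) covers modes that never occur in a group.
counts = df.pivot_table(index=['year', 'month'], columns='mode',
                        aggfunc='size', fill_value=0)

# Turn year/month back into ordinary columns and drop the 'mode' label
# left on the column axis, so the columns are year, month, a, b.
df2 = counts.reset_index().rename_axis(columns=None)
print(df2)
```

The reset_index call is what turns the year/month index levels into regular columns, matching the shape asked for in the question.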
Alternatively, you can use pd.get_dummies():
pd.get_dummies(df).groupby(['year','month']).sum()
Result:
            mode_a  mode_b
year month
1    4           3       2
Note: I'm not sure whether this works in a Koalas (Apache Spark) environment.
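If aggfunc='size' or get_dummies turns out not to be available in your environment, one possible fallback is to count with a helper column of ones and a plain sum. This is only a sketch, written and checked against pandas; whether it behaves the same on Koalas is an untested assumption, and the helper column name n is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'year': [1, 1, 1, 1, 1],
                   'month': [4, 4, 4, 4, 4],
                   'mode': ['a', 'b', 'a', 'a', 'b']})

# Helper column of ones (hypothetical name 'n'), so counting becomes a sum.
counted = df.assign(n=1)

# Sum the ones per (year, month) group for each mode.
df2 = (counted.pivot_table(index=['year', 'month'], columns='mode',
                           values='n', aggfunc='sum', fill_value=0)
              .reset_index()
              .rename_axis(columns=None))
print(df2)
```

The idea is to rely only on a sum aggregation over a named values column, which is the most basic form of pivot_table.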