I am working with a pandas DataFrame and am having trouble with a groupby operation. My DataFrame is:
Name A Job B
A Online Govt 22
B Offline Pvt 50
C Others Other 33
A Others Govt 62
B Online Pvt 18
C Offline Other 35
A Offline Govt 53
B Online Pvt 75
C Others Other 74
My final output should be:
Name Offline Online Others Govt Pvt Other
A 53 20 62 1 0 0
B 50 18 75 0 1 0
C 35 74 33 0 0 1
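This is the code I have tried so far:
import pandas as pd

data = pd.read_csv('/Users../pandas_data/abc1.csv')
# First B value per (Name, A) pair, with the values of A spread across columns
df3 = data.groupby(['Name', 'A'])['B']\
    .first()\
    .unstack(fill_value='NA')\
    .rename_axis(None, axis=1)  # axis has to be passed by keyword in recent pandas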
Thanks in advance.
You should take a look at pandas' pivot_table. For the second part of the output you can use
pd.pivot_table(
    data=df.drop(columns='A'),
    index=['Name'],
    columns=['Job'],
    aggfunc=lambda x: int(len(x) > 0),
    fill_value=0
)
which would yield
B
Job Govt Other Pvt
Name
A 1 0 0
B 0 0 1
C 0 1 0
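The output above carries an extra column level named B, because pivot_table was left to infer the value column. A minimal sketch of one way to avoid that, assuming the frame is rebuilt by hand from the table in the question (the name df and the inline data are only for illustration), is to pass values='B' explicitly:

```python
import pandas as pd

# Sample data retyped from the question; assumed to match the CSV contents.
df = pd.DataFrame({
    "Name": list("ABCABCABC"),
    "A": ["Online", "Offline", "Others", "Others", "Online",
          "Offline", "Offline", "Online", "Others"],
    "Job": ["Govt", "Pvt", "Other"] * 3,
    "B": [22, 50, 33, 62, 18, 35, 53, 75, 74],
})

# Passing values as a single column name keeps the result columns flat
# (just the Job values) instead of a two-level (B, Job) header.
job_flags = pd.pivot_table(
    data=df.drop(columns="A"),
    index=["Name"],
    columns=["Job"],
    values="B",
    aggfunc=lambda x: int(len(x) > 0),  # 1 if the (Name, Job) pair occurs at all
    fill_value=0,                       # 0 for pairs that never occur
)
print(job_flags)
```

With flat columns, the later concat step produces a plain header rather than a mix of one-level and two-level column labels.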
Then do something similar for the first part (don't know how to interpret ..), like
pd.pivot_table(
    data=df.drop(columns='Job'),
    index=['Name'],
    columns=['A'],
    aggfunc='first'  # placeholder mirroring the first() in the question; depends on what you expect
)
and finally concatenate the two results using pd.concat over axis=1.
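Putting the pieces together, the whole approach might look like the sketch below. It is not a definitive answer: the frame is retyped from the question, the names part_a, part_job and result are made up for illustration, and 'first' is assumed as the aggregation for the first part only because the question's own attempt uses first().

```python
import pandas as pd

# Sample data retyped from the question; assumed to match the CSV contents.
df = pd.DataFrame({
    "Name": list("ABCABCABC"),
    "A": ["Online", "Offline", "Others", "Others", "Online",
          "Offline", "Offline", "Online", "Others"],
    "Job": ["Govt", "Pvt", "Other"] * 3,
    "B": [22, 50, 33, 62, 18, 35, 53, 75, 74],
})

# Part 1: one column per value of A, holding the first B seen for each
# (Name, A) pair. 'first' is an assumption; swap in 'sum', 'max', etc. as needed.
part_a = pd.pivot_table(df, index="Name", columns="A", values="B", aggfunc="first")

# Part 2: 0/1 flag per Job, as in the pivot_table call shown earlier.
part_job = pd.pivot_table(
    df,
    index="Name",
    columns="Job",
    values="B",
    aggfunc=lambda x: int(len(x) > 0),
    fill_value=0,
)

# Both parts are indexed by Name, so they line up row by row.
result = pd.concat([part_a, part_job], axis=1)
print(result)
```

Pairs that never occur in the data (for example Name B with Others) come out as NaN in the first part; chaining .fillna(...) onto part_a is one way to replace them if a placeholder is preferred.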