I have this dataframe:
df:
   type  size  margin  height
0     A     2       5       1
1     A     3       4       1
2     B     1       1       3
I want to group by type, count the number of companies in each type, and calculate the medians of all the other columns.
I know that the count can be done like this:
df = df.groupby('type')['type'].count()
But is there a way to use a one liner and put everything in the same df?
Something like:
df=df.groupby('type').calculate_medians_and_counts
It should come out looking like this:
type count size margin height
A 2 2.5 4.5 1
B 1 1 1 3
(size, margin and height are the medians from df)
Use agg with a dictionary:
d = {'size':'median', 'margin':'median', 'height':'median', 'type':'size'}
Or, if there are many columns, create the dictionary dynamically:
d = dict.fromkeys(df.columns.difference(['type']), 'median')
d['type'] = 'size'
df = df.groupby('type').agg(d).rename(columns={'type':'count'}).reset_index()
Another alternative uses join:
df = df.groupby('type').median().join(df.type.value_counts().rename('count')).reset_index()
print (df)
  type  margin  size  height  count
0    A     4.5   2.5       1      2
1    B     1.0   1.0       3      1
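For reference, here is a minimal self-contained sketch of the same idea using named aggregation, which is a different technique from the dictionary shown above and assumes pandas 0.25 or later. The sample frame is rebuilt from the question, and the names value_cols, named and result are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "type": ["A", "A", "B"],
    "size": [2, 3, 1],
    "margin": [5, 4, 1],
    "height": [1, 1, 3],
})

# Every column except the grouping key gets a median.
value_cols = [c for c in df.columns if c != "type"]
named = {c: (c, "median") for c in value_cols}

# "size" as an aggregation counts the rows in each group;
# any value column can serve as its input.
named["count"] = (value_cols[0], "size")

result = df.groupby("type").agg(**named).reset_index()

# Put the columns in the order asked for in the question.
result = result[["type", "count", *value_cols]]
print(result)
```

Building the keyword arguments in a dict keeps this workable when there are many columns, at the cost of being slightly longer than a literal one-liner.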
I would use median grouped on index level 0, plus value_counts:
pd.concat([df.set_index('type').groupby(level=0).median(), df.type.value_counts()], axis=1)
Out[787]:
      size  margin  height  type
type
A      2.5     4.5     1.0     2
B      1.0     1.0     3.0     1
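The name of the Series returned by value_counts depends on the pandas version (from 2.0 on it is named count rather than after the column), so the last column header above may differ on your machine. A hedged sketch of a variant that names the count column explicitly is below. It swaps value_counts for groupby(...).size(), and the names medians, counts and out are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "type": ["A", "A", "B"],
    "size": [2, 3, 1],
    "margin": [5, 4, 1],
    "height": [1, 1, 3],
})

# Per-type medians of every non-key column.
medians = df.groupby("type").median()

# Rows per type, with an explicit column name.
counts = df.groupby("type").size().rename("count")

# Both pieces share the "type" index, so they align column-wise.
out = pd.concat([medians, counts], axis=1)
print(out)
```

Calling out.reset_index() afterwards turns type back into a regular column if that is preferred.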