[英]Group By Median, Percentile and Percent of Total
I have a dataframe that looks like this...我有一个看起来像这样的 dataframe...
ID Acuity TOTAL_ED_LOS
1 2 423
2 5 52
3 5 535
4 1 87
...
I would like to produce a table that looks like this:我想制作一个如下所示的表格:
Acuity Count Median Percentile_25 Percentile_75 % of total
1 234 ... 31%
2 65 ... 8%
3 56 ... 7%
4 345 ... 47%
5 35 ... 5%
I already have code that will give me everything I need except for the % of total column我已经有代码可以提供我需要的一切,除了总列的百分比
def percentile(n):
def percentile_(x):
return np.percentile(x, n)
percentile_.__name__ = 'percentile_%s' % n
return percentile_
df_grp = df_merged_v1.groupby(['Acuity'])
df_grp['TOTAL_ED_LOS'].agg(['count','median',
percentile(25), percentile(75)]).reset_index()
Is there an efficient way I can add the percent of total column?有没有一种有效的方法可以添加总列的百分比? The link below contain code on how to obtain the percent of total but I'm unsure how to apply it to my code.
下面的链接包含有关如何获得总数百分比的代码,但我不确定如何将其应用于我的代码。 I know that I could create two tables and then merge them but am curious if there is a cleaner way.
我知道我可以创建两个表然后合并它们,但我很好奇是否有更清洁的方法。
How to calculate count and percentage in groupby in Python 如何在 Python 中计算 groupby 中的计数和百分比
Here's a one way to do it using some pandas builtin tools:这是使用一些 pandas 内置工具的一种方法:
# Set random number seeed and create a dummy datafame with two columns
np.random.seed(123)
df = pd.DataFrame({'activity':np.random.choice([*'ABCDE'], 40),
'TOTAL_ED_LDS':np.random.randint(50, 500, 40)})
# Reshape dataframe to get activit per column
# then use the output from describe and transpose
df_out = df.set_index([df.groupby('activity').cumcount(),'activity'])['TOTAL_ED_LDS']\
.unstack().describe().T
#Calculate percent count of total count
df_out['% of Total'] = df_out['count'] / df_out['count'].sum() * 100.
df_out
Output: Output:
count mean std min 25% 50% 75% max % of Total
activity
A 8.0 213.125000 106.810162 93.0 159.50 200.0 231.75 421.0 20.0
B 10.0 308.200000 116.105125 68.0 240.75 324.5 376.25 461.0 25.0
C 6.0 277.666667 117.188168 114.0 193.25 311.5 352.50 409.0 15.0
D 7.0 370.285714 124.724649 120.0 337.50 407.0 456.00 478.0 17.5
E 9.0 297.000000 160.812002 51.0 233.00 294.0 415.00 488.0 22.5
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.