
Adding percentiles to datatable based on column groups

I have a datatable with scores and would like to add, for each score column, a percentile computed within the age and group that each row belongs to.

Age  Group  Score1 Score2  
22   A      95     85  
23   B      88     76  
25   B      84     56  
22   A      68     65  
25   B      76     85  
23   B      59     75 

So, for example, the rows for 22/A, 23/B and 25/B would each be treated as a separate group when calculating the percentiles.

Your posted example has very few cases for each group, so I'm using mtcars as an example:

library(dplyr)

# example data
df <- mtcars %>% select(am, cyl, disp, wt)

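Assume that am and cyl are your grouping variables and that disp and wt are your scores. Within each group, cume_dist() returns the proportion of values less than or equal to the current one: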

df %>% 
  group_by(am, cyl) %>%
  # funs() is deprecated since dplyr 0.8; across() needs dplyr >= 1.0.0
  mutate(across(c(disp, wt), cume_dist, .names = "{.col}_prc")) %>%
  ungroup() %>%
  arrange(am, cyl) %>%  # not needed; only for visualisation
  data.frame()          # not needed; only for visualisation

#    am cyl  disp    wt  disp_prc     wt_prc
# 1   0   4 146.7 3.190 1.0000000 1.00000000
# 2   0   4 140.8 3.150 0.6666667 0.66666667
# 3   0   4 120.1 2.465 0.3333333 0.33333333
# 4   0   6 258.0 3.215 1.0000000 0.25000000
# 5   0   6 225.0 3.460 0.7500000 1.00000000
# 6   0   6 167.6 3.440 0.5000000 0.75000000
# 7   0   6 167.6 3.440 0.5000000 0.75000000
# 8   0   8 360.0 3.440 0.6666667 0.16666667
# 9   0   8 360.0 3.570 0.6666667 0.33333333
# 10  0   8 275.8 4.070 0.2500000 0.75000000
# 11  0   8 275.8 3.730 0.2500000 0.41666667
# 12  0   8 275.8 3.780 0.2500000 0.50000000
# 13  0   8 472.0 5.250 1.0000000 0.83333333
# 14  0   8 460.0 5.424 0.9166667 1.00000000
# 15  0   8 440.0 5.345 0.8333333 0.91666667
# 16  0   8 318.0 3.520 0.4166667 0.25000000
# 17  0   8 304.0 3.435 0.3333333 0.08333333
# 18  0   8 350.0 3.840 0.5000000 0.58333333
# 19  0   8 400.0 3.845 0.7500000 0.66666667
# 20  1   4 108.0 2.320 0.7500000 0.87500000
# 21  1   4  78.7 2.200 0.3750000 0.75000000
# 22  1   4  75.7 1.615 0.2500000 0.25000000
# 23  1   4  71.1 1.835 0.1250000 0.37500000
# 24  1   4  79.0 1.935 0.5000000 0.50000000
# 25  1   4 120.3 2.140 0.8750000 0.62500000
# 26  1   4  95.1 1.513 0.6250000 0.12500000
# 27  1   4 121.0 2.780 1.0000000 1.00000000
# 28  1   6 160.0 2.620 1.0000000 0.33333333
# 29  1   6 160.0 2.875 1.0000000 1.00000000
# 30  1   6 145.0 2.770 0.3333333 0.66666667
# 31  1   8 351.0 3.170 1.0000000 0.50000000
# 32  1   8 301.0 3.570 0.5000000 1.00000000
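
The same pattern can be applied to the data from the question. This is only a sketch: the data frame name scores and the _prc suffix are my own choices, the six rows are typed in by hand from the post, and across() assumes dplyr 1.0.0 or later. With only two rows per Age/Group combination, every percentile comes out as either 0.5 or 1.

```r
library(dplyr)

# Sample data copied by hand from the question
scores <- data.frame(
  Age    = c(22, 23, 25, 22, 25, 23),
  Group  = c("A", "B", "B", "A", "B", "B"),
  Score1 = c(95, 88, 84, 68, 76, 59),
  Score2 = c(85, 76, 56, 65, 85, 75)
)

# One percentile column per score column, computed within each Age/Group pair
scores_prc <- scores %>%
  group_by(Age, Group) %>%
  mutate(across(c(Score1, Score2), cume_dist, .names = "{.col}_prc")) %>%
  ungroup()

print(as.data.frame(scores_prc))
```

The original row order is kept, so the new columns line up with the rows as posted.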

You can then round the percentiles to two decimal places, or convert them to percentage values and combine them with the actual scores in a single column.
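
As a rough illustration of that last step, the sketch below rounds the percentile and also builds a combined "score (percent)" label for disp. The column names disp_prc and disp_label and the label format are assumptions made for this example, not something the question asked for.

```r
library(dplyr)

df <- mtcars %>% select(am, cyl, disp, wt)

formatted <- df %>%
  group_by(am, cyl) %>%
  mutate(
    # percentile rounded to two decimal places
    disp_prc   = round(cume_dist(disp), 2),
    # score and whole-number percentage combined in one text column
    disp_label = paste0(disp, " (", round(100 * cume_dist(disp)), "%)")
  ) %>%
  ungroup()

head(as.data.frame(formatted))
```

Note that the combined column is character, so keep the numeric percentile column as well if you still need to sort or filter on it.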
