Find the "top N" in a group and find the average of the "top N" in R

   Rank Laps Average_Time
1     1    1           30
2     2    1           34
3     3    1           35
4     1    2           32
5     2    2           33
6     3    2           56
7     4    1           43
8     5    1           23
9     6    1           31
10    4    2           23
11    5    2           88
12    6    2           54

I would like to know how I can group ranks 1-3 and ranks 4-6 and get an average of the "average time" for each lap. Also, I would like this to extend if I have groups 7-9, 10-13, etc.

One option is to use cut to bin the ranks into groups and to add Laps as a second grouping variable. You can then summarize the data to get the mean of Average_Time for each rank group and lap.

library(tidyverse)

df %>%
  group_by(gr = cut(Rank, breaks = seq(0, 6, by = 3)), Laps) %>%
  summarize(avg = mean(Average_Time))

Output

  gr     Laps   avg
  <fct> <int> <dbl>
1 (0,3]     1  33  
2 (0,3]     2  40.3
3 (3,6]     1  32.3
4 (3,6]     2  55  
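The breaks above stop at 6, so any rank above 6 would fall outside the bins and cut would return NA for it. If the groups are all the same width, the upper bound can be derived from the data instead. The following is a minimal sketch under that assumption; group_size, upper and res are illustrative names that do not appear in the original answer.

```r
library(dplyr)

df <- data.frame(
  Rank = c(1L, 2L, 3L, 1L, 2L, 3L, 4L, 5L, 6L, 4L, 5L, 6L),
  Laps = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L),
  Average_Time = c(30L, 34L, 35L, 32L, 33L, 56L, 43L, 23L, 31L, 23L, 88L, 54L)
)

# Assumed width of every rank group (1-3, 4-6, 7-9, ...)
group_size <- 3

# Round the largest rank up to a multiple of group_size so that
# every rank lands inside one of the bins
upper <- ceiling(max(df$Rank) / group_size) * group_size

res <- df %>%
  group_by(gr = cut(Rank, breaks = seq(0, upper, by = group_size)), Laps) %>%
  summarize(avg = mean(Average_Time), .groups = "drop")

res
```

With the sample data this produces the same four rows as above, and it should keep working unchanged when ranks 7-9 or higher are added.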

Another option, if you want the range of ranks displayed as the group label:

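df %>%
  # Bin the ranks first so min/max are computed within each bin
  group_by(gr = cut(Rank, breaks = seq(0, 6, by = 3))) %>%
  # Build a readable label such as "1-3" from the ranks present in the bin
  mutate(Rank_gr = paste0(min(Rank), "-", max(Rank))) %>%
  # Regroup by the label and the lap, then average
  group_by(Rank_gr, Laps) %>%
  summarize(avg = mean(Average_Time))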

Output

  Rank_gr  Laps   avg
  <chr>   <int> <dbl>
1 1-3         1  33  
2 1-3         2  40.3
3 4-6         1  32.3
4 4-6         2  55  

Since your groups will be uneven (for example 10-13), you might want to use case_when to define the groups explicitly:

df %>%
  group_by(gr=case_when(Rank %in% 1:3 ~ "1-3",
                        Rank %in% 4:6 ~ "4-6",
                        Rank %in% 7:9 ~ "7-9",
                        Rank %in% 10:13 ~ "10-13"), 
           Laps) %>%
  summarize(avg = mean(Average_Time))
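An alternative to case_when for uneven groups is to pass an explicit vector of break points to cut, along with matching labels. This is only a sketch: the break points below are taken from the groups named in the question, and breaks, labels and res_uneven are illustrative names. One possible advantage is that the result is a factor whose levels follow the break order, whereas character labels would typically sort "10-13" before "4-6".

```r
library(dplyr)

df <- data.frame(
  Rank = c(1L, 2L, 3L, 1L, 2L, 3L, 4L, 5L, 6L, 4L, 5L, 6L),
  Laps = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L),
  Average_Time = c(30L, 34L, 35L, 32L, 33L, 56L, 43L, 23L, 31L, 23L, 88L, 54L)
)

# Upper edge of each rank group: 1-3, 4-6, 7-9, 10-13
breaks <- c(0, 3, 6, 9, 13)

# Build labels like "1-3" from consecutive break points
labels <- paste0(head(breaks, -1) + 1, "-", tail(breaks, -1))

res_uneven <- df %>%
  group_by(gr = cut(Rank, breaks = breaks, labels = labels), Laps) %>%
  summarize(avg = mean(Average_Time), .groups = "drop")

res_uneven
```

To add another group, append its upper edge to breaks; the labels are rebuilt from the break points automatically.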

Data

df <- structure(list(Rank = c(1L, 2L, 3L, 1L, 2L, 3L, 4L, 5L, 6L, 4L, 
5L, 6L), Laps = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 
2L), Average_Time = c(30L, 34L, 35L, 32L, 33L, 56L, 43L, 23L, 
31L, 23L, 88L, 54L)), class = "data.frame", row.names = c(NA, 
-12L))
