Subset the highest Values of every Group in R

Question

Data:

ID<-c(1,2,3,4,5,6,7,8)
Value<-c(5,4,7,2,6,3,9,4)
Group<-c(1,1,1,2,3,2,2,3)
data<-data.frame(ID,Value,Group)
I would like to put the 2 rows with the highest Values from every Group into a new data frame.

The final result should look like this:

ID<-c(1,3,6,7,5,8)
Value<-c(5,7,3,9,6,4)
Group<-c(1,1,2,2,3,3)
Finaldata<-data.frame(ID,Value,Group)

My approach is:

Finaldata<-head(data[order(Value,decreasing=TRUE),],n=2)

but I can't work out how to make it do this for every Group rather than just for the overall highest Values.

Answer 1

With "data.table" you can try something like this:

library(data.table)
as.data.table(data)[order(Group, -Value), head(.SD, 2), by = Group]
#    Group ID Value
# 1:     1  3     7
# 2:     1  1     5
# 3:     2  7     9
# 4:     2  6     3
# 5:     3  5     6
# 6:     3  8     4

Answer 2

Using dplyr: if you are on dplyr 0.3 (i.e. the devel version), slice is available; otherwise, you could use do. You can install the devel version with:

devtools::install_github("hadley/dplyr") #first you need to install `devtools`.

Also, you can check the link https://github.com/hadley/dplyr

library(dplyr) 
data%>% 
    group_by(Group) %>%
    arrange(desc(Value)) %>%
    slice(1:2) # do(head(.,2)) #in dplyr 0.2

gives the result

#   ID Value Group
#1  3     7     1
#2  1     5     1
#3  7     9     2
#4  6     3     2
#5  5     6     3
#6  8     4     3

With slice, you can get the 2nd highest value for each group (i.e. slice(2)), or any range of rows from a starting row to an end row, limited to the rows each group actually has. In this example, slice(2:3) gives only 1 row for group 3, as there were only 2 rows in that group.

Or, using base R:
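To make the slice behaviour described above concrete, here is a minimal sketch on the question's data. It assumes a dplyr version that provides slice, and it sorts with arrange(Group, desc(Value)) before grouping, which is a choice made here for clarity rather than the form used in the answer. The names sorted, second and second_third are illustrative only.

```r
library(dplyr)

# Data from the question
data <- data.frame(ID    = c(1, 2, 3, 4, 5, 6, 7, 8),
                   Value = c(5, 4, 7, 2, 6, 3, 9, 4),
                   Group = c(1, 1, 1, 2, 3, 2, 2, 3))

# Sort by Group, then by descending Value, and group the result
sorted <- data %>%
  arrange(Group, desc(Value)) %>%
  group_by(Group)

# Second-highest Value in each group
second <- sorted %>% slice(2)

# Rows 2 to 3 of each group; group 3 only has 2 rows,
# so it contributes a single row here
second_third <- sorted %>% slice(2:3)

second
second_third
```

Row positions that do not exist in a group are simply dropped, which is why group 3 yields one row for slice(2:3).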

data[with(data, ave(-Value, Group, FUN=rank)%in% 1:2),]
#  ID Value Group
#1  1     5     1
#3  3     7     1
#5  5     6     3
#6  6     3     2
#7  7     9     2
#8  8     4     3
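One caveat with the rank approach above: rank uses ties.method = "average" by default, so tied values can receive fractional ranks such as 2.5 and fall outside 1:2. The sketch below shows one possible way around this with ties.method = "first"; the data frame tied is a hypothetical example, not part of the question.

```r
# Hypothetical data with a tie for second place in the group
tied <- data.frame(ID = 1:3, Value = c(5, 5, 7), Group = c(1, 1, 1))

# Default ties.method = "average": the two 5s both get rank 2.5,
# so only the row with Value 7 is kept
default_pick <- tied[with(tied, ave(-Value, Group, FUN = rank) %in% 1:2), ]

# ties.method = "first" breaks ties by position, so ranks are 1, 2, 3
first_pick <- tied[with(tied,
  ave(-Value, Group,
      FUN = function(v) rank(v, ties.method = "first")) %in% 1:2), ]

nrow(default_pick)  # 1
nrow(first_pick)    # 2
```

Whether tied rows should be kept, dropped, or broken by position depends on the use case; on the question's data there are no ties within a group, so both variants return the same six rows.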

Answer 3

Try:

ll = lapply(split(data, data$Group), function(x) tail(x[order(x$Value),],2) )
ll
$`1`
  ID Value Group
1  1     5     1
3  3     7     1

$`2`
  ID Value Group
6  6     3     2
7  7     9     2

$`3`
  ID Value Group
8  8     4     3
5  5     6     3

To bind to a data frame:

do.call(rbind, ll) 
    ID Value Group
1.1  1     5     1
1.3  3     7     1
2.6  6     3     2
2.7  7     9     2
3.8  8     4     3
3.5  5     6     3

or, with data.table loaded:

rbindlist(ll) # requires library(data.table)
   ID Value Group
1:  1     5     1
2:  3     7     1
3:  6     3     2
4:  7     9     2
5:  8     4     3
6:  5     6     3
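The split/lapply/rbind steps above can be wrapped into a small reusable helper. This is only a sketch in base R: the function name top_n_per_group and its arguments are made up for illustration, and it uses head on a descending sort instead of tail on an ascending one, so the highest Value comes first within each group.

```r
# Data from the question
data <- data.frame(ID    = c(1, 2, 3, 4, 5, 6, 7, 8),
                   Value = c(5, 4, 7, 2, 6, 3, 9, 4),
                   Group = c(1, 1, 1, 2, 3, 2, 2, 3))

# Hypothetical helper: keep the n rows with the largest value_col
# within each level of group_col
top_n_per_group <- function(df, value_col, group_col, n = 2) {
  pieces <- lapply(split(df, df[[group_col]]), function(x) {
    head(x[order(x[[value_col]], decreasing = TRUE), ], n)
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL  # drop the "1.3"-style row names from rbind
  out
}

Finaldata <- top_n_per_group(data, "Value", "Group", n = 2)
Finaldata
```

Passing the column names as strings keeps the helper independent of any vectors named Value or Group in the global environment.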

3 answers

solution1
4 ACCEPTED 2014-10-06 17:28:03

solution2
1 2014-10-06 17:19:27

solution3
0 2014-10-06 17:27:09
