Get minimum grouped by unique combination of two columns

Question

What I'm trying to achieve in R is the following: given a table (a data frame in my case), I want to get the lowest price for each unique combination of two columns.

For example, given the following table:

+-----+-----------+-------+----------+----------+
| Key | Feature1  | Price | Feature2 | Feature3 |
+-----+-----------+-------+----------+----------+
| AAA |         1 |   100 | whatever | whatever |
| AAA |         1 |   150 | whatever | whatever |
| AAA |         1 |   200 | whatever | whatever |
| AAA |         2 |   110 | whatever | whatever |
| AAA |         2 |   120 | whatever | whatever |
| BBB |         1 |   100 | whatever | whatever |
+-----+-----------+-------+----------+----------+

I want a result that looks like:

+-----+-----------+-------+----------+----------+
| Key | Feature1  | Price | Feature2 | Feature3 |
+-----+-----------+-------+----------+----------+
| AAA |         1 |   100 | whatever | whatever |
| AAA |         2 |   110 | whatever | whatever |
| BBB |         1 |   100 | whatever | whatever |
+-----+-----------+-------+----------+----------+

So I'm working on a solution along the lines of:

s <- lapply(split(data, list(data$Key, data$Feature1)), function(chunk) { 
        chunk[which.min(chunk$Price),]})

But the result is a 1 x n matrix, so I need to unsplit the result. It also seems very slow. How can I improve this logic? I've seen solutions pointing in the direction of the data.table package. Should I rewrite this using that package?
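For reference, the per-group rows produced by the split/lapply approach can be bound back into a single data frame with do.call(rbind, ...). This is a minimal sketch, assuming the sample table above is stored in a data frame called data; the drop = TRUE argument is an addition here that skips Key/Feature1 combinations which never occur (such as BBB with Feature1 = 2).

```r
# Sample data mirroring the table in the question
data <- data.frame(
  Key      = c("AAA", "AAA", "AAA", "AAA", "AAA", "BBB"),
  Feature1 = c(1, 1, 1, 2, 2, 1),
  Price    = c(100, 150, 200, 110, 120, 100),
  Feature2 = "whatever",
  Feature3 = "whatever",
  stringsAsFactors = FALSE
)

# One chunk per observed Key/Feature1 combination
chunks <- split(data, list(data$Key, data$Feature1), drop = TRUE)

# Keep the cheapest row of each chunk, then stack the rows back together
s <- do.call(rbind, lapply(chunks, function(chunk) {
  chunk[which.min(chunk$Price), ]
}))
rownames(s) <- NULL
s
```

This keeps all columns, but binding many small data frames with rbind is likely to stay slow on large inputs, so it addresses the unsplit part of the problem rather than the performance part.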

Update

Great answers, thanks! However, my original data frame contains more columns (Feature2, ...) and I need them all back after the filtering. Rows that do not have the lowest price (for their combination of Key/Feature1) can be discarded, so I'm not interested in their values for Feature2 / Feature3.

Answer 1

You can use the dplyr package:

library(dplyr)

data %>% group_by(Key, Feature1) %>%
         slice(which.min(Price))

Answer 2

Since you mentioned the data.table package, here is a solution using it:

library(data.table)
setDT(df)[,.(Price=min(Price)),.(Key, Feature1)] #initial question
setDT(df)[,.SD[which.min(Price)],.(Key, Feature1)] #updated question

df is your sample data.frame.
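Since the question also mentions speed, a commonly used variant of the same idea is to collect the row numbers of the per-group minima with .I and subset once, rather than building .SD for every group. The sketch below assumes the data.table package is installed and uses a hypothetical dt built from the sample table in the question; whether it is actually faster depends on the size and shape of the data.

```r
library(data.table)

# Sample data mirroring the table in the question
dt <- data.table(
  Key      = c("AAA", "AAA", "AAA", "AAA", "AAA", "BBB"),
  Feature1 = c(1, 1, 1, 2, 2, 1),
  Price    = c(100, 150, 200, 110, 120, 100),
  Feature2 = "whatever",
  Feature3 = "whatever"
)

# .I holds the row numbers within dt; pick the row of the minimum Price per group
idx <- dt[, .I[which.min(Price)], by = .(Key, Feature1)]$V1

# Subset once, keeping every column
result <- dt[idx]
result
```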

Update: Test using mtcars data

df<-mtcars
library(data.table)
setDT(df)[,.SD[which.min(mpg)],by=am]
   am  mpg cyl disp  hp drat   wt  qsec vs gear carb
1:  1 15.0   8  301 335 3.54 3.57 14.60  0    5    8
2:  0 10.4   8  472 205 2.93 5.25 17.98  0    3    4

Answer 3

A base R solution would be aggregate(Price ~ Key + Feature1, data, FUN = min)

Answer 4

Using base R aggregate:

> aggregate(Price~Key+Feature1, min, data=data)
  Key Feature1 Price
1 AAA        1   100
2 BBB        1   100
3 AAA        2   110

See this post for other alternatives.
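Note that aggregate returns only Key, Feature1 and Price. For the updated question, where the remaining columns are needed as well, one base R option is to sort by price within each group and keep the first row of each Key/Feature1 combination. This is a minimal sketch, assuming the sample table from the question is stored in data; when several rows tie for the lowest price, it keeps only the first one after sorting.

```r
# Sample data mirroring the table in the question
data <- data.frame(
  Key      = c("AAA", "AAA", "AAA", "AAA", "AAA", "BBB"),
  Feature1 = c(1, 1, 1, 2, 2, 1),
  Price    = c(100, 150, 200, 110, 120, 100),
  Feature2 = "whatever",
  Feature3 = "whatever",
  stringsAsFactors = FALSE
)

# Sort so that the cheapest row comes first within each Key/Feature1 group
ordered <- data[order(data$Key, data$Feature1, data$Price), ]

# Keep the first row of every Key/Feature1 combination, with all columns
result <- ordered[!duplicated(ordered[c("Key", "Feature1")]), ]
result
```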

4 answers

solution1
3 ACCEPTED 2015-07-10 15:22:47

solution2
3 2015-07-10 15:24:28

solution3
1 2015-07-10 15:25:01

solution4
0 2015-07-10 15:27:51
