Subset dataframe by unique values within a column in R

Question

Hello, I have a dataframe such as:

Group COL1 Event 
G1 SP1  1
G1 SP2  1
G1 SP3  2
G1 SP3  2 
G2 SP4  3
G2 SP7  3
G2 SP5  6
G3 SP1  1 
G4 SP1  6

I want to keep only the rows where COL1 has a single distinct value within its Group and Event combination (here, for example, SP3 and SP5 are each the only COL1 value in their Group/Event combination).

Then I should get:

Group COL1 Event 
G1 SP3  2
G1 SP3  2 
G2 SP5  6 
G3 SP1  1 
G4 SP1  6

SP1 and SP2 are two different COL1 values in G1 with Event 1, so they do not pass.

SP4 and SP7 are two different COL1 values in G2 with Event 3, so they do not pass.

Answer 1

You can use data.table to group by Group and Event and only return the group contents ( .SD ) if the number of unique COL1 values ( uniqueN(COL1) ) is 1.

library(data.table)
setDT(df)

df[, if(uniqueN(COL1) == 1) .SD, by = .(Group, Event)]
#    Group Event COL1
# 1:    G1     2  SP3
# 2:    G1     2  SP3
# 3:    G2     6  SP5
# 4:    G3     1  SP1
# 5:    G4     6  SP1

Data used:

df <- fread('
Group COL1 Event 
G1 SP1  1
G1 SP2  1
G1 SP3  2
G1 SP3  2 
G2 SP4  3
G2 SP7  3
G2 SP5  6
G3 SP1  1 
G4 SP1  6  
')

Answer 2

An option with base R using ave, which computes the number of unique COL1 values within each Group and Event combination:

subset(df, ave(COL1, Group, Event,
      FUN = function(x) length(unique(x))) == 1)
#  Group COL1 Event
#3    G1  SP3     2
#4    G1  SP3     2
#7    G2  SP5     6
#8    G3  SP1     1
#9    G4  SP1     6
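To make the ave step easier to follow, here is a minimal sketch that computes the per-row counts separately before filtering. It assumes the sample data is rebuilt as a plain data.frame, and the name n_distinct_col1 is only illustrative. COL1 is converted to integer codes here so that ave returns integer counts; that conversion is a choice made for this sketch, not part of the answer above.

```r
# Sample data from the question, rebuilt as a plain data.frame
df <- data.frame(
  Group = c("G1", "G1", "G1", "G1", "G2", "G2", "G2", "G3", "G4"),
  COL1  = c("SP1", "SP2", "SP3", "SP3", "SP4", "SP7", "SP5", "SP1", "SP1"),
  Event = c(1, 1, 2, 2, 3, 3, 6, 1, 6),
  stringsAsFactors = FALSE
)

# Number of distinct COL1 values within each Group/Event combination,
# repeated on every row of that combination (n_distinct_col1 is a
# hypothetical helper name used only in this sketch)
n_distinct_col1 <- ave(
  as.integer(factor(df$COL1)), df$Group, df$Event,
  FUN = function(x) length(unique(x))
)

# Keep the rows whose Group/Event combination holds a single COL1 value
result <- df[n_distinct_col1 == 1, ]
print(result)
```

Inspecting n_distinct_col1 next to df shows which combinations are dropped: the rows for G1/Event 1 and G2/Event 3 each get a count of 2, while all other rows get 1.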

Answer 3

Another data.table option:

setDT(df)[, .SD[uniqueN(COL1) == 1], by = .(Group, Event)]
#    Group Event COL1
# 1:    G1     2  SP3
# 2:    G1     2  SP3
# 3:    G2     6  SP5
# 4:    G3     1  SP1
# 5:    G4     6  SP1

3 answers

solution1
3 ACCEPTED 2021-01-26 14:10:18

solution2
3 2021-01-26 19:25:24

solution3
2 2021-01-26 21:06:46
