I have this example dataset; the actual one has millions of rows, so I'd appreciate a data.table solution, but a tidyverse solution would also be fine:
dat1 = data.frame(name = c("X1", "X1", "X1", "X2", "X2", "X2", "X2", "X2", "X2"),
year = c(2015,2016,2017,2015,2016,2016,2017,2017, 2018),
choice = c("o","o","o","o","o","r","r","o","o")
)
dat1
The logic I need to apply is:
If, for any name and year combination, only choice "o" exists, retain the row with "o".
If, for any name and year combination, choices "o" and "r" both exist, retain the row with "r" and drop the row with "o".
I don't want to spell out specific name and year combinations by hand.
Does this work:
library(dplyr)
dat1 %>%
  group_by(name, year) %>%
  filter(all(choice == 'o') | choice == 'r')
# A tibble: 7 x 3
# Groups: name, year [7]
name year choice
<chr> <dbl> <chr>
1 X1 2015 o
2 X1 2016 o
3 X1 2017 o
4 X2 2015 o
5 X2 2016 r
6 X2 2017 r
7 X2 2018 o
library(data.table)
setDT(dat1)
dat1[, .SD[all(choice == "o") | choice == "r",], by = .(name, year)]
# name year choice
# 1: X1 2015 o
# 2: X1 2016 o
# 3: X1 2017 o
# 4: X2 2015 o
# 5: X2 2016 r
# 6: X2 2017 r
# 7: X2 2018 o
(I generated this before looking at KarthikS's answer, but the logic and the results are identical.)
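Since the real data has millions of rows, a variant that may be worth trying is to collect the row numbers to keep with `.I` and subset the table once, rather than subsetting `.SD` inside every group. This is only a sketch on the example data and has not been benchmarked here; the names `dt`, `keep` and `res` are illustrative.

```r
library(data.table)

# Same example data as in the question.
dt <- data.table(
  name   = c("X1", "X1", "X1", "X2", "X2", "X2", "X2", "X2", "X2"),
  year   = c(2015, 2016, 2017, 2015, 2016, 2016, 2017, 2017, 2018),
  choice = c("o", "o", "o", "o", "o", "r", "r", "o", "o")
)

# .I holds row numbers of the full table; per group, keep every row
# when the group is all "o", otherwise keep only the "r" rows.
keep <- dt[, .I[all(choice == "o") | choice == "r"], by = .(name, year)]$V1

# Subset once using the collected row numbers (rows 1, 2, 3, 4, 6, 7, 9).
res <- dt[keep]
res
```

The filtering condition is the same as in the `.SD` version above, so the retained rows should match.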
Another option is to convert the column to a factor with levels specified in a custom order ('r' before 'o'), drop the unused levels with droplevels, and then keep only the rows that match the first remaining level:
library(dplyr)
dat1 %>%
group_by(name, year) %>%
filter(choice %in% levels(droplevels(factor(choice,
levels = c('r', 'o'))))[1])
# A tibble: 7 x 3
# Groups: name, year [7]
# name year choice
# <chr> <dbl> <chr>
#1 X1 2015 o
#2 X1 2016 o
#3 X1 2017 o
#4 X2 2015 o
#5 X2 2016 r
#6 X2 2017 r
#7 X2 2018 o
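To see why the factor approach picks the right value, it can help to run the inner expression on a single group. The sketch below uses two made-up groups and a small helper, `first_level`, which is a hypothetical name introduced only for illustration.

```r
# Hypothetical single groups, used only for illustration.
both   <- c("o", "r")   # a group containing both choices
only_o <- c("o", "o")   # a group containing only "o"

# With levels ordered c("r", "o"), "r" is the first level whenever it is
# present; droplevels() removes it when the group has no "r".
first_level <- function(x) {
  levels(droplevels(factor(x, levels = c("r", "o"))))[1]
}

first_level(both)    # "r"
first_level(only_o)  # "o"
```

Filtering with `choice %in% first_level(choice)` inside each group therefore keeps the "r" rows when any exist and the "o" rows otherwise.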
An equivalent option with data.table is:
library(data.table)
setDT(dat1)[dat1[, .I[choice %in%
levels(droplevels(factor(choice,
levels = c('r', 'o'))))[1]], .(name, year)]$V1]