Good evening!
I have a dataset with the following structure:
Year source destination HS04 value
1989 ARG BRA 0101 1
1989 ARG BRA 0102 0
1989 ARG BRA 0103 0
1989 ARG BRA 0104 1
. . . . .
. . . . .
. . . . .
2010 ARG BRA 0101 1
2010 ARG BRA 0102 1
2010 ARG BRA 0103 1
2010 ARG BRA 0104 1
I need to drop the HS04 codes whose value did not change over the period. In the example, that means HS04 0101 and HS04 0104, since both the initial and final years have a value of 1.
The unit of reference is HS04: for a given pair of countries (e.g. ARG and BRA), I want to keep only the HS04 codes whose value changed between the initial period and the final period.
The sample period covers 1989-2010.
Thanks in advance for your attention!
We can try:
library(dplyr)
# Flag HS04 codes whose value never changes (min equals max), then drop them
data %>% group_by(HS04) %>%
  mutate(flag = ifelse(min(value) == max(value), 1, 0)) %>%
  filter(flag == 0) %>% ungroup()
Data
# Read HS04 as character so the leading zeros (e.g. "0101") are preserved
data <- read.table(text = "
Year source destination HS04 value
1989 ARG BRA 0101 1
1989 ARG BRA 0102 0
1989 ARG BRA 0103 0
1989 ARG BRA 0104 1
2010 ARG BRA 0101 1
2010 ARG BRA 0102 1
2010 ARG BRA 0103 1
2010 ARG BRA 0104 1
", header = TRUE, colClasses = c(HS04 = "character"))
Using @A. Suliman's data, n_distinct() fits your need exactly. Keep only the HS04 groups that contain more than one distinct value:
data %>% group_by(HS04) %>%
filter(n_distinct(value) >1) %>% ungroup()
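Both answers above check whether the value varied at any point in the sample, and they group by HS04 only. The question asks specifically about the initial versus the final year for a given country pair, so a variant along these lines may be closer to what is wanted. This is only a sketch: it assumes each source/destination/HS04 combination has exactly one row per year, and the name `changed` is just illustrative.

```r
library(dplyr)

# Same sample data as above; HS04 is read as character to keep leading zeros
data <- read.table(text = "
Year source destination HS04 value
1989 ARG BRA 0101 1
1989 ARG BRA 0102 0
1989 ARG BRA 0103 0
1989 ARG BRA 0104 1
2010 ARG BRA 0101 1
2010 ARG BRA 0102 1
2010 ARG BRA 0103 1
2010 ARG BRA 0104 1
", header = TRUE, colClasses = c(HS04 = "character"))

# For each country pair and HS04 code, keep the group only if the value in
# the earliest year differs from the value in the latest year
changed <- data %>%
  group_by(source, destination, HS04) %>%
  filter(value[which.min(Year)] != value[which.max(Year)]) %>%
  ungroup()

changed
```

On the sample data this keeps the rows for HS04 0102 and 0103. Unlike the n_distinct() approach, a code that changes mid-period but ends at its starting value would be dropped here, so pick whichever definition of "varied" matches your analysis.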