Finding elements in a vector that are duplicated or that are not in another vector

Question

I have the following situation:

vec1 <- c("A", "B", "D", "C", "E", "A", "C")
vec2 <- c("A", "B", "C", "D", "F")

First question: which elements are duplicated? (Answer: "A" and "C" for vec1, none for vec2.)

Second question: which elements are in vec1 but not in vec2, irrespective of order (answer: "E"), or vice versa (answer: "F")?

I tried:

which(vec1 != vec2)
which(vec2 != vec1)

[1] 3 4 5 7
Warning message:
In vec1 != vec2 :
  longer object length is not a multiple of shorter object length

which is not what I expected.

Answer 1

For the first question, try ?duplicated

vec1.dup <- duplicated(vec1)
unique(vec1[vec1.dup])

[1] "A" "C"

For the second, try ?setdiff. To get the values of vec2 that are not in vec1 (swap the arguments for the reverse direction, which gives "E"):

setdiff(vec2, vec1)
[1] "F"
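
For completeness, here is a minimal sketch that runs both parts of the question end to end. The variable names (dups, only_in_vec1, only_in_vec2, any_dups_vec2) are mine, not from the question; only base R functions are used.

```r
vec1 <- c("A", "B", "D", "C", "E", "A", "C")
vec2 <- c("A", "B", "C", "D", "F")

# Values that occur more than once in vec1, each reported once
dups <- unique(vec1[duplicated(vec1)])

# Values present in one vector but absent from the other
only_in_vec1 <- setdiff(vec1, vec2)
only_in_vec2 <- setdiff(vec2, vec1)

# TRUE if vec2 contains any repeated value
any_dups_vec2 <- any(duplicated(vec2))
```

Note that setdiff() treats its arguments as sets, so the result never contains repeated values even if the input does.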

Answer 2

Elements in vec1 that are duplicated:

vec1[duplicated(vec1)]

[1] "A" "C"
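
One caveat, sketched below with a made-up vector (vec3 is hypothetical, not from the question): duplicated() flags every occurrence after the first, so a value that appears three times shows up twice in the subset. Wrapping the result in unique() reports each duplicated value once.

```r
# Hypothetical vector in which "A" appears three times
vec3 <- c("A", "B", "A", "C", "A", "C")

# One entry per extra occurrence, so "A" is listed twice
repeats <- vec3[duplicated(vec3)]

# Each duplicated value listed once
distinct_repeats <- unique(repeats)
```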

Elements in vec1 that are not in vec2:

vec1[is.na(match(vec1,vec2))]

[1] "E"

And vice versa:

vec2[is.na(match(vec2,vec1))]

[1] "F"
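
An equivalent way to write these lookups, as a sketch, is with %in%, which is built on match(). The key point is that the vector being subset is the one on the left of %in%, and it is tested against the other vector. The names not_in_vec2 and not_in_vec1 are mine.

```r
vec1 <- c("A", "B", "D", "C", "E", "A", "C")
vec2 <- c("A", "B", "C", "D", "F")

# Elements of vec1 with no match in vec2
not_in_vec2 <- vec1[!(vec1 %in% vec2)]

# Elements of vec2 with no match in vec1
not_in_vec1 <- vec2[!(vec2 %in% vec1)]
```

Unlike setdiff(), this form keeps repeated values: if "E" appeared twice in vec1, it would appear twice in the result.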

Answer 3

It appears that your second question is really "Why?" (I see that you have already gotten good answers to the "How?").

which(vec1 != vec2)
which(vec2 != vec1)

Both return

[1] 3 4 5 7

The answer lies largely in the warning message:

Warning message:
In vec1 != vec2 :
  longer object length is not a multiple of shorter object length

When dyadic operators like "!=" work on vectors, the recycling rules take over: the longer of the two vectors determines the length of the comparison, and the shorter one is extended by recycling its elements from the start. You end up testing:

> c("A", "B", "C", "D", "F", "A", "B") != c("A", "B", "D", "C", "E", "A", "C")
# left-hand side is vec2, extended by recycling its first two elements ("A", "B")
[1] FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE
> c("A", "B", "D", "C", "E", "A", "C") != c("A", "B", "C", "D", "F", "A", "B")
# right-hand side is vec2, extended by recycling its first two elements ("A", "B")
[1] FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE
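
You can check this yourself with a small sketch that does the recycling by hand. It assumes base R's rep_len() for the explicit extension; vec2_recycled, manual and implicit are names I made up for illustration.

```r
vec1 <- c("A", "B", "D", "C", "E", "A", "C")
vec2 <- c("A", "B", "C", "D", "F")

# Extend vec2 to the length of vec1 by repeating it from the start
vec2_recycled <- rep_len(vec2, length(vec1))

# Position-by-position comparison against the hand-recycled vector
manual <- which(vec1 != vec2_recycled)

# The original comparison, with the recycling warning silenced
implicit <- suppressWarnings(which(vec1 != vec2))

# Both approaches flag the same positions
same_result <- identical(manual, implicit)
```

So "!=" compares position by position and says nothing about set membership, which is why it cannot answer the "in one but not the other" question.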


3 answers

solution1
5 2012-07-05 01:40:49

solution2
3 ACCEPTED 2012-07-05 01:38:51

solution3
3 2012-07-05 02:58:03
