
Check whether two vectors contain the same (unordered) elements in R

I'd like to check whether two vectors contain the same elements, even if they're not ordered the same. For example, the function (let's call it SameElements) should satisfy these criteria:

SameElements(c(1, 2, 3), c(1, 2, 3))  # TRUE
SameElements(c(1, 2, 3), c(3, 2, 1))  # TRUE
SameElements(c(1, 2, 1), c(1, 2))  # FALSE
SameElements(c(1, 1, 2, 3), c(3, 2, 1))  # FALSE

Edit 1: Specified that the function should return FALSE when the vectors contain the same elements, but with different frequencies.

Edit 2: Cleaned up question to omit initial answer, as this is now in my actual answer.

I think you can use setequal(a,b)

Update: setequal checks whether two vectors are composed of the same elements, but it does not check whether those elements occur the same number of times in each vector.
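A quick check of that limitation, using one of the pairs from the question. This is a minimal sketch that relies only on base R:

```r
a <- c(1, 1, 2, 3)
b <- c(3, 2, 1)

# setequal() treats its arguments as sets, so the duplicated 1 is ignored
setequal(a, b)                # TRUE

# Comparing the sorted vectors also takes frequencies into account
identical(sort(a), sort(b))   # FALSE
```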

In lieu of a cleaner alternative, here's the known solution:

SameElements <- function(a, b) return(identical(sort(a), sort(b)))
SameElements(c(1, 2, 3), c(1, 3, 2))  # TRUE
SameElements(c(1, 2, 3), c(1, 1, 3, 2))  # FALSE

Edit: identical instead of all.equal(...) == T per nrussell's suggestion.
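The difference between the two can be seen in a short sketch: all.equal() returns a character description rather than FALSE when its arguments differ, so it has to be wrapped in isTRUE() to serve as a test. One trade-off worth knowing is that identical() is strict about storage type, while all.equal() compares numeric values within a tolerance.

```r
x <- c(1, 2, 3)
y <- c(1, 1, 3, 2)

# all.equal() returns a message, not FALSE, when the inputs differ
res <- all.equal(sort(x), sort(y))
is.character(res)                                 # TRUE
isTRUE(res)                                       # FALSE

# identical() always returns a single TRUE or FALSE
identical(sort(x), sort(y))                       # FALSE

# identical() distinguishes integer from double storage
identical(sort(x), sort(c(3L, 2L, 1L)))           # FALSE
isTRUE(all.equal(sort(x), sort(c(3L, 2L, 1L))))   # TRUE
```

If the vectors may mix integers and doubles, the isTRUE(all.equal(...)) form may therefore be the better fit.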

You may be interested in the "compare" package. This answer demonstrates the compare() function, but for your case, you might do just fine with compareIgnoreOrder() (which matches almost exactly with your question's title).

Several arguments control which transformations compare() is allowed to apply while trying to make the two objects match. In the examples below (to save some typing), I've asked the function to allow all transformations (allowAll = TRUE) except for changing the vector length (shorten = FALSE).

library(compare)
compare(A1, A2, allowAll = TRUE, shorten = FALSE)
# TRUE
compare(A1, A3, allowAll = TRUE, shorten = FALSE)
# TRUE
#   sorted
compare(A1, A4, allowAll = TRUE, shorten = FALSE)
# FALSE
#   sorted
compare(B1, B2, allowAll = TRUE, shorten = FALSE)
# FALSE
#   sorted
compare(B1, A4, allowAll = TRUE, shorten = FALSE)
# FALSE
#   sorted
compare(A3f, A1, allowAll = TRUE, shorten = FALSE)
# TRUE
#   coerced from <numeric> to <factor>
#   sorted

Sample data:

A1 <- c(1, 2, 3); A2 <- c(1, 2, 3)
A3 <- c(3, 2, 1); A4 <- c(1, 1, 2, 3)
B1 <- c(1, 2, 1); B2 <- c(1, 2)
A3f <- factor(A3)
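For the narrower case in the question, compareIgnoreOrder() could be called directly on the same sample data. This is only a sketch: it assumes the compare package is installed, and it prints the returned comparison objects without assuming anything about their internal structure.

```r
A1 <- c(1, 2, 3)
A3 <- c(3, 2, 1)
A4 <- c(1, 1, 2, 3)

# Run only if the 'compare' package is available (assumption: it is installed)
if (requireNamespace("compare", quietly = TRUE)) {
  # Same elements in a different order
  print(compare::compareIgnoreOrder(A1, A3))
  # Same unique values but different frequencies
  print(compare::compareIgnoreOrder(A1, A4))
}
```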

Here is my solution:

SameElements <- function (a,b){
  l <- Map(table,list(a, b)) # Compute frequencies - returns ordered table
  Reduce(identical,l) # Check if frequencies are the same for all input vectors
}

SameElements(c(1, 2, 3), c(1, 2, 3))  # TRUE
SameElements(c(1, 2, 3), c(3, 2, 1))  # TRUE
SameElements(c(1, 2, 1), c(1, 2))  # FALSE
SameElements(c(1, 1, 2, 3), c(3, 2, 1))  # FALSE

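This works when the list holds exactly two vectors. With more than two it returns FALSE even for matching vectors, because Reduce feeds the logical result of the first identical() call into the next comparison.

One-liner for a list of two vectors:

Reduce(identical,Map(table,listOfVectors))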
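With three or more vectors, Reduce(identical, ...) ends up comparing a logical value against a table, so a version that compares every frequency table against the first is safer. A minimal sketch, where SameElementsAll is a hypothetical helper name that does not appear in the answer above:

```r
# Hypothetical helper: TRUE if all vectors in the list hold the same
# elements with the same frequencies
SameElementsAll <- function(vectors) {
  tabs <- lapply(vectors, table)   # one frequency table per vector
  # Compare each remaining table against the first one
  all(vapply(tabs[-1], function(tab) identical(tab, tabs[[1]]), logical(1)))
}

three <- list(c(1, 2, 3), c(3, 2, 1), c(2, 1, 3))

Reduce(identical, Map(table, three))   # FALSE: second step is identical(TRUE, <table>)
SameElementsAll(three)                 # TRUE
SameElementsAll(list(c(1, 2, 3), c(3, 2, 1), c(1, 1, 2, 3)))   # FALSE
```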

Basically, your problem can be outlined in these steps:

if not same unique values: return FALSE
else if same frequencies: return TRUE
    else return FALSE

In proper R code:

SameElements <- function(v1, v2)
{
  tab1 <- table(v1) ; tab2 <- table(v2)
  # Same unique values test (the length check avoids recycling warnings)
  if( length(tab1) != length(tab2) || !all(names(tab1) == names(tab2)) ) return(FALSE)
  return(all(tab1 == tab2)) # Same frequencies test
}

Some examples:

v1 <- c(1, 2, 3)
v2 <- c(3, 2, 1)
SameElements(v1, v2) # TRUE: both the unique-values test and the frequencies test pass

v1 <- c(1, 1, 2, 3)
v2 <- c(3, 2, 1)
SameElements(v1, v2) # FALSE: the frequencies test fails

PS: i) You can replace !all(x == y) with any(x != y).
ii) To speed up the code, you can return FALSE right away when the two vectors don't have the same length, thus avoiding the frequency computation.
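Both suggestions from the PS can be combined in one sketch. SameElementsFast is a hypothetical name used here for illustration only:

```r
# Hypothetical variant with an early length check and any() in place of !all()
SameElementsFast <- function(v1, v2) {
  # Vectors of different lengths cannot have the same frequencies
  if (length(v1) != length(v2)) return(FALSE)
  tab1 <- table(v1)
  tab2 <- table(v2)
  # Same unique values test
  if (length(tab1) != length(tab2) || any(names(tab1) != names(tab2))) return(FALSE)
  # Same frequencies test
  all(tab1 == tab2)
}

SameElementsFast(c(1, 2, 3), c(3, 2, 1))      # TRUE
SameElementsFast(c(1, 2, 1), c(1, 2))         # FALSE
SameElementsFast(c(1, 1, 2, 3), c(3, 2, 1))   # FALSE
```

Since length() runs in constant time, the early return costs almost nothing when the lengths do match.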


 