Smartest way to check if an observation in data.frame(x) also exists in data.frame(y) and populate a new column according to the result

Question

Having two dataframes:

x <- data.frame(numbers=c('1','2','3','4','5','6','7','8','9'), coincidence="NA")

and

y <- data.frame(numbers=c('1','3','10'))

How can I check whether the observations in y (1, 3 and 10) also exist in x and fill the column x["coincidence"] accordingly (for example with YES|NO, TRUE|FALSE...)?

I would do this in Excel with a formula combining IFERROR and VLOOKUP, but I don't know how to do the same in R.

Note: I am open to changing the data.frames to tables or using libraries. The data frame with the numbers to check (y) will never have more than 10-20 observations, while the other one (x) will never have more than 1K observations. Therefore, I could also iterate with an if, if necessary.

Answer 1

We can build the desired column with a set-membership test that returns TRUE or FALSE for each row. %in% is a binary operator that checks, for each value on the left-hand side, whether it occurs in the set of values on the right:

x$coincidence <- x$numbers %in% y$numbers
x
# numbers coincidence
# 1       1        TRUE
# 2       2       FALSE
# 3       3        TRUE
# 4       4       FALSE
# 5       5       FALSE
# 6       6       FALSE
# 7       7       FALSE
# 8       8       FALSE
# 9       9       FALSE
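
Since the question also mentions YES|NO labels, here is a minimal sketch of one way to map the logical result to text with ifelse(). The variable missing_in_x is a hypothetical name introduced only for this illustration, and the data frames are rebuilt as plain character columns:

```r
# Same values as in the question; stringsAsFactors = FALSE keeps them as character
x <- data.frame(numbers = as.character(1:9), stringsAsFactors = FALSE)
y <- data.frame(numbers = c("1", "3", "10"), stringsAsFactors = FALSE)

# %in% yields a logical vector; ifelse() turns it into labels
x$coincidence <- ifelse(x$numbers %in% y$numbers, "YES", "NO")

# Reverse check: which values of y do not occur in x?
# (missing_in_x is a hypothetical helper name)
missing_in_x <- y$numbers[!(y$numbers %in% x$numbers)]
```

With this data, rows 1 and 3 of x get "YES", and the reverse check picks up "10", the one value of y that has no counterpart in x.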

Answer 2

Do numbers have to be factors, as you've set them up? (They're not numbers, but character.) If not, it's easy:

x <- data.frame(numbers=c('1','2','3','4','5','6','7','8','9'), coincidence="NA", stringsAsFactors=FALSE)
y <- data.frame(numbers=c('1','3','10'), stringsAsFactors=FALSE)

x$coincidence[x$numbers %in% y$numbers] <- TRUE


> x
  numbers coincidence
1       1        TRUE
2       2          NA
3       3        TRUE
4       4          NA
5       5          NA
6       6          NA
7       7          NA
8       8          NA
9       9          NA
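
The unmatched rows above print NA because the column was initialised with the string "NA", and assigning TRUE into a character column turns it into the string "TRUE". A possible variant, assuming a real logical column is wanted, is to start from FALSE instead:

```r
# Assumption: coincidence starts as logical FALSE rather than the string "NA"
x <- data.frame(numbers = as.character(1:9), coincidence = FALSE,
                stringsAsFactors = FALSE)
y <- data.frame(numbers = c("1", "3", "10"), stringsAsFactors = FALSE)

# Only the matching rows are overwritten; the rest stay FALSE
x$coincidence[x$numbers %in% y$numbers] <- TRUE
```

This keeps the column logical, so it can be used directly for subsetting, e.g. x[x$coincidence, ].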

If they need to be factors, then you'll need to either set common levels or use as.character().
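
A rough sketch of the as.character() route, assuming both columns are factors with different level sets. The explicit conversion is defensive here: the documentation for match(), on which %in% is built, states that factors are converted to character vectors, but spelling it out keeps the intent clear and also guards comparisons made with ==:

```r
# Factor columns with different level sets (an assumption for this sketch)
x <- data.frame(numbers = factor(as.character(1:9)))
y <- data.frame(numbers = factor(c("1", "3", "10")))

# Compare the labels rather than the underlying integer codes
x$coincidence <- as.character(x$numbers) %in% as.character(y$numbers)
```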

2 answers

solution1
4 ACCEPTED 2016-01-14 17:08:28

solution2
0 2016-01-14 17:03:20
