
Smartest way to check whether an observation in data.frame(x) also exists in data.frame(y) and populate a new column according to the result

Given two data frames:

x <- data.frame(numbers=c('1','2','3','4','5','6','7','8','9'), coincidence="NA")

and

y <- data.frame(numbers=c('1','3','10'))

How can I check whether the observations in y (1, 3 and 10) also exist in x and fill the column x["coincidence"] accordingly (for example with YES|NO, TRUE|FALSE...)?

In Excel I would do this with a formula combining IFERROR and VLOOKUP, but I don't know how to do the same in R.

Note: I am open to converting the data.frames to tables or using libraries. The data frame with the numbers to check (y) will never have more than 10-20 observations, while the other one (x) will never have more than 1K observations. Therefore, I could also iterate with an if, if necessary.

We can build the desired column with a set membership test that returns TRUE or FALSE for each row. %in% is a binary operator that checks, for each value on the left-hand side, whether it appears in the set of values on the right:

x$coincidence <- x$numbers %in% y$numbers
# numbers coincidence
# 1       1        TRUE
# 2       2       FALSE
# 3       3        TRUE
# 4       4       FALSE
# 5       5       FALSE
# 6       6       FALSE
# 7       7       FALSE
# 8       8       FALSE
# 9       9       FALSE
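If YES|NO labels are preferred over TRUE|FALSE, the logical vector can be passed through ifelse(). The sketch below also shows the reverse direction (which values of y appear in x), since the question can be read either way. The column name in_x is an assumption made for illustration, not something from the question.

```r
x <- data.frame(numbers = c('1','2','3','4','5','6','7','8','9'),
                stringsAsFactors = FALSE)
y <- data.frame(numbers = c('1','3','10'), stringsAsFactors = FALSE)

# Membership test per row of x, mapped to "YES"/"NO" labels
x$coincidence <- ifelse(x$numbers %in% y$numbers, "YES", "NO")

# Reverse direction: flag each value of y that also occurs in x
# (in_x is a hypothetical column name)
y$in_x <- y$numbers %in% x$numbers
```

With only 10-20 values in y and at most 1K rows in x, this vectorised form should be more than fast enough, so an explicit loop is probably unnecessary.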

Does the numbers column have to be a factor, as you've set it up? (The values are character strings, not numbers, and before R 4.0 data.frame() converted strings to factors by default.) If not, it's easy:

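# coincidence starts as a real missing value (NA), not the string "NA",
# so the column stays logical after the assignment below
x <- data.frame(numbers=c('1','2','3','4','5','6','7','8','9'), coincidence=NA, stringsAsFactors=FALSE)
y <- data.frame(numbers=c('1','3','10'), stringsAsFactors=FALSE)

# Mark only the rows of x whose value also appears in y
x$coincidence[x$numbers %in% y$numbers] <- TRUE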


> x
  numbers coincidence
1       1        TRUE
2       2          NA
3       3        TRUE
4       4          NA
5       5          NA
6       6          NA
7       7          NA
8       8          NA
9       9          NA
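The indexed assignment only touches the matching rows, so every other row keeps whatever the column held before. If FALSE is wanted for the non-matching rows instead, one option is to initialise the column with FALSE. A minimal sketch, assuming the same data as in the question:

```r
# Start with FALSE everywhere, then flip the matching rows to TRUE
x <- data.frame(numbers = c('1','2','3','4','5','6','7','8','9'),
                coincidence = FALSE, stringsAsFactors = FALSE)
y <- data.frame(numbers = c('1','3','10'), stringsAsFactors = FALSE)

x$coincidence[x$numbers %in% y$numbers] <- TRUE
```

This gives a fully populated logical column, which is easier to filter or count on than one containing missing values.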

If they need to be factors, then you'll need to either set common levels or use as.character().
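For the factor case, a minimal sketch of the as.character() route follows. Note that the R documentation for match() (on which %in% is built) states that factors are converted to character vectors, so the explicit conversion may be redundant for %in% itself; it is harmless, and it does matter for other comparisons such as ==, which refuses factors with different level sets.

```r
# Both columns are factors with different level sets
x <- data.frame(numbers = factor(c('1','2','3','4','5','6','7','8','9')))
y <- data.frame(numbers = factor(c('1','3','10')))

# Compare the labels, not the underlying integer codes
x$coincidence <- as.character(x$numbers) %in% as.character(y$numbers)
```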

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 