I have two tables that are joined. After the join, some of the values come out as NA.
I am trying to join again with a third data set, but only on those NA values. How do I do it?
The joined results
library(plyr)
## first table
original_value <- c('old_a', 'old_b', 'old_c', 'old_d')
key <- c('a', 'b', 'c', 'd')
data <- data.frame(key, original_value, stringsAsFactors = FALSE)
## lookup table
new_value <- c('new_a', 'new_b')
key <- c('a', 'b')
lookup <- data.frame(key, new_value, stringsAsFactors = FALSE)
## the joined data
data_lookup_joined <- join(data, lookup, by = "key")
> data_lookup_joined
key original_value new_value
1 a old_a new_a
2 b old_b new_b
3 c old_c <NA>
4 d old_d <NA>
This is the third data set, followed by the output I am trying to get:
## a third data set to join the NA values
unmatched_value <- c('unmatched_c', 'unmatched_d')
key <- c('c', 'd')
unmatched_lookup <- data.frame(key, unmatched_value, stringsAsFactors = FALSE)
key original_value new_value
1 a old_a new_a
2 b old_b new_b
3 c old_c unmatched_c
4 d old_d unmatched_d
This is what I have tried that did not work.
data_lookup_joined$new_value [is.na(data_lookup_joined$new_value)] <- join(data_lookup_joined, unmatched_lookup, by = "key")
What do I need to do?
# join only the rows whose new_value is missing
has_na <- is.na(data_lookup_joined$new_value)
na_join <- join(data_lookup_joined[has_na, c("key", "original_value")],
                unmatched_lookup, by = "key")
# rename the third column (unmatched_value) so the column names match
names(na_join)[3] <- "new_value"
# put the matched and newly filled rows back together
final_result <- rbind(data_lookup_joined[!has_na, ], na_join)
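The attempt in the question fails because its right-hand side is a whole data frame rather than a vector of replacement values. If you would rather fill the NA entries in place, a minimal base R sketch using match() could look like this. The data frames are rebuilt here only so the snippet runs on its own, and idx is a helper name introduced for illustration:

```r
# Rebuild the example data so the snippet is self-contained
data_lookup_joined <- data.frame(
  key = c("a", "b", "c", "d"),
  original_value = c("old_a", "old_b", "old_c", "old_d"),
  new_value = c("new_a", "new_b", NA, NA),
  stringsAsFactors = FALSE
)
unmatched_lookup <- data.frame(
  key = c("c", "d"),
  unmatched_value = c("unmatched_c", "unmatched_d"),
  stringsAsFactors = FALSE
)

# Rows that the first join left unmatched
has_na <- is.na(data_lookup_joined$new_value)

# For each unmatched key, find its row position in the second lookup table
idx <- match(data_lookup_joined$key[has_na], unmatched_lookup$key)

# Assign a plain vector of values, one per NA row
data_lookup_joined$new_value[has_na] <- unmatched_lookup$unmatched_value[idx]
```

Keys that are absent from unmatched_lookup as well simply stay NA, because match() returns NA for them.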
Of course, the simpler way would be to rbind lookup and unmatched_lookup first (after renaming unmatched_value to new_value so the column names match); then you need only one join.
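The single-join idea might be sketched as follows. This version uses base R's merge() with all.x = TRUE in place of plyr's join(), purely so the snippet has no package dependency; combined_lookup is a name introduced here for illustration:

```r
# Rebuild the example data so the snippet is self-contained
data <- data.frame(
  key = c("a", "b", "c", "d"),
  original_value = c("old_a", "old_b", "old_c", "old_d"),
  stringsAsFactors = FALSE
)
lookup <- data.frame(
  key = c("a", "b"),
  new_value = c("new_a", "new_b"),
  stringsAsFactors = FALSE
)
unmatched_lookup <- data.frame(
  key = c("c", "d"),
  unmatched_value = c("unmatched_c", "unmatched_d"),
  stringsAsFactors = FALSE
)

# rbind() on data frames needs identical column names, so rename first
names(unmatched_lookup)[names(unmatched_lookup) == "unmatched_value"] <- "new_value"

# Stack the two lookup tables into one
combined_lookup <- rbind(lookup, unmatched_lookup)

# One left join: keep every row of data, attach new_value where the key matches
final_result <- merge(data, combined_lookup, by = "key", all.x = TRUE)
```

This assumes no key appears in both lookup tables; a duplicated key would produce duplicated rows in the result. Note also that merge() sorts the result by key by default, whereas join() keeps the row order of the first table.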