Let's assume two data frames, A and B, containing data like the following:
Dataframe: A     Dataframe: B
ColA             ColB1        ColB2
| Dog   |        | Lion     | yes
| Lion  |        | Cat      |
| Zebra |        | Elephant |
| Bat   |        | Dog      | yes
I want to compare the values of ColA with the values of ColB1 and insert "yes" into ColB2 wherever there is a match. This is what I'm running:
for (i in 1:nrow(B)) {
  for (j in 1:nrow(A)) {
    if (B[i, 1] == A[j, 1]) {
      B[i, 2] <- "yes"
    }
  }
}
In reality we're talking about 20,000 rows. How can this be made faster?
You can use the %in% operator to test membership:
B$ColB2 <- B$ColB1 %in% A$ColA
ColB2 will contain TRUE or FALSE depending on whether the value in ColB1 of data frame B was found in ColA of data frame A.
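If the literal string "yes" is wanted instead of a logical value, the result of %in% can be converted afterwards. The following is a minimal sketch, assuming the data frames look like the example in the question (they are rebuilt here for illustration, and the intermediate name found is hypothetical):

```r
# Example data rebuilt from the question (assumed for illustration)
A <- data.frame(ColA = c("Dog", "Lion", "Zebra", "Bat"),
                stringsAsFactors = FALSE)
B <- data.frame(ColB1 = c("Lion", "Cat", "Elephant", "Dog"),
                stringsAsFactors = FALSE)

# Vectorised membership test: one logical value per row of B
found <- B$ColB1 %in% A$ColA

# Turn the logical vector into the "yes" / empty-string labels from the question
B$ColB2 <- ifelse(found, "yes", "")

print(B)
```

Because the whole column is processed in one vectorised call, no explicit loop over rows is needed.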
For more info see:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/match.html
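That help page documents match() together with %in%. As a rough illustration of how the two relate, here is a small sketch using the values from the question (the variable names x, lookup, pos, in_lookup and same are made up for this example):

```r
x      <- c("Lion", "Cat", "Elephant", "Dog")   # values to look up (ColB1)
lookup <- c("Dog", "Lion", "Zebra", "Bat")      # values to search in (ColA)

# match() gives the position of the first match in `lookup`, or NA if none
pos <- match(x, lookup)

# A value is "in" the lookup vector exactly when a match position exists
in_lookup <- !is.na(pos)

# This agrees with what %in% returns for the same inputs
same <- identical(in_lookup, x %in% lookup)
```

match() is the one to reach for when the position of the match is needed, for example to pull another column from A; %in% is enough when only a yes/no answer is required.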