How to compare two strings word by word in R

I have a dataset with two string variables. Both contain sentences that I want to compare word by word. I want to create a new column ("new_var") that looks like this:

var1                   var2               new_var
"sentence numer one"  "setence numer two" sentence:setence + one:two
"another one is here" "aner one are hre"  another:aner + is:are + here:hre

I don't know how to write code that works on a whole dataset, i.e. that loops over the rows and adds a new column based on a condition. My code only works when var1 and var2 are defined as single strings, like this:

library(stringr)

var1 = "this is sentence numer one"
var2 = "this is setence numer two"

new_var <- for (i in 1:(lengths(gregexpr("\\s+", var1)) + 1)) {
  if (word(string = var1, start = i, end = i) != word(string = var2, start = i, end = i)) {
    cat(word(string = var1, start = i, end = i), word(string = var2, start = i, end = i), "+", sep = ":")
  } else {
    cat("")
  }
}

One possibility is to use str_split from the stringr package and then map2 from the purrr package.

First I create some sample data:

x <- c("sentence number one", "another one is here")
y <- c("setence number two", "aner one are hre")

Then I split each sentence into words and compare the two word lists element by element:

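library(stringr)
library(purrr)

# Split each sentence into a character vector of words
x2 <- str_split(x, " ")
y2 <- str_split(y, " ")

# "" where the words match, "word1:word2" where they differ
map2(x2, y2, ~ifelse(.x == .y, "", paste(.x, .y, sep = ":")))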

[[1]]
[1] "sentence:setence" ""                 "one:two"

[[2]]
[1] "another:aner" ""             "is:are"       "here:hre"
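
To get from that list to the new_var column the question asks for, the empty strings still have to be dropped and the remaining pairs joined with " + ". Here is a minimal sketch in base R, assuming both sentences in a row always contain the same number of words. compare_words is a hypothetical helper name, not a function from any package, and df is just the question's example rebuilt as a data frame:

```r
# Hypothetical helper: compare two sentences word by word and return
# the differing pairs as "a:b + c:d" ("" if nothing differs).
compare_words <- function(a, b) {
  wa <- strsplit(a, "\\s+")[[1]]
  wb <- strsplit(b, "\\s+")[[1]]
  # Assumption: both sentences have the same number of words.
  stopifnot(length(wa) == length(wb))
  differs <- wa != wb
  paste(paste(wa[differs], wb[differs], sep = ":"), collapse = " + ")
}

# Example data taken from the question
df <- data.frame(
  var1 = c("sentence numer one", "another one is here"),
  var2 = c("setence numer two", "aner one are hre"),
  stringsAsFactors = FALSE
)

# Apply the helper row by row and store the result as a new column
df$new_var <- mapply(compare_words, df$var1, df$var2, USE.NAMES = FALSE)
print(df$new_var)
```

If the sentences can differ in length, the stopifnot check will stop with an error, so you would have to decide first how unmatched words should be reported.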
