I have used adist
to calculate the number of characters that differ between two strings:
a <- "#IvoryCoast TENNIS US OPEN Clément «Un beau combat» entre Simon et Cilic"
b <- "Clément «Un beau combat» entre Simon et Cilic"
adist(a,b) # result 27
Now I would like to extract all the occurrences of those characters that differ. In my example, I would like to get the string "#IvoryCoast TENNIS US OPEN "
.
I tried and used:
paste(Reduce(setdiff, strsplit(c(a, b), split = "")), collapse = "")
But the obtained result is not what I expected!
#IvysTENOP
For this case, you could use gsub.
> a <- "#IvoryCoast TENNIS US OPEN Clément «Un beau combat» entre Simon et Cilic"
> b <- "Clément «Un beau combat» entre Simon et Cilic"
> gsub(b, "", a)
[1] "#IvoryCoast TENNIS US OPEN "
You can do, based on the paste/reduce
solution:
paste(Reduce(setdiff, strsplit(c(a, b), split = " ")), collapse = " ")
#[1] "#IvoryCoast TENNIS US OPEN"
Or, if you want to get separated items, with setdiff
and strsplit
:
setdiff(strsplit(a," ")[[1]],strsplit(b," ")[[1]])
#[1] "#IvoryCoast" "TENNIS" "US" "OPEN"
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.