I can't figure out why this R regular expression doesn't match both of these two strings. As I understand it, this expression should match any string with the same lower-case letter appearing twice within the string. It matches one of the strings ("Moro"), but not the second ("moro"), even though both strings contain a repeated lower-case "o". What's going on here?
Executed in R (3.4.3):
grep("([a-z]).*\\1", c("Moro", "moro"), value=TRUE)
[1] "Moro"
The same thing occurs with this regex, which I believe is identical to the one above:
grep("([[:lower:]]).*\\1", c("Moro", "moro"), value=TRUE)
[1] "Moro"
Thanks for any help!
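This seems to be a regex flavor issue. By default, grep uses R's TRE-based extended regular expression engine; if you set perl = T, it switches to PCRE and the pattern works: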
grep("([a-z]).*\\1", c("Moro", "moro", "mora"), value=TRUE, perl = T)
# [1] "Moro" "moro"
Worth noting that stringr and stringi work out-of-the-box:
stringr::str_detect(c("Moro", "moro", "mora"), "([a-z]).*\\1")
# [1] TRUE TRUE FALSE
stringi::stri_detect(c("Moro", "moro", "mora"), regex = "([a-z]).*\\1")
# [1] TRUE TRUE FALSE
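To see what the pattern actually matches under perl = TRUE, here is a minimal sketch using base R's grepl, regexpr and regmatches. The variable names (x, pattern, matched, m, hits) are only illustrative, and "mora" is included as a string that should not match:

```r
x <- c("Moro", "moro", "mora")

# A lower-case letter, then anything, then the same letter again
pattern <- "([a-z]).*\\1"

# Which strings contain a repeated lower-case letter (PCRE engine)
matched <- grepl(pattern, x, perl = TRUE)

# Position of the first match in each string (-1 means no match)
m <- regexpr(pattern, x, perl = TRUE)

# The matched substrings; strings without a match are dropped
hits <- regmatches(x, m)

print(matched)
print(hits)
```

Both "Moro" and "moro" should yield the substring "oro", which suggests the back-reference itself is fine and that the difference lies in the default engine.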
I'm not sure, but my guess is that it fails because the character class can match any lower-case letter. If you use a simple [o], it works:
grep("([a-z]).*\\1", c("Moro", "moro"), value=TRUE)
# [1] "Moro"
grep("([o]).*\\1", c("Moro", "moro"), value=TRUE)
# [1] "Moro" "moro"