I am stuck on a regular expression. I usually use this line of code to find overlapping occurrences of a pattern in a string:
gregexpr("(?=ATGGGCT)",text,perl=TRUE)
[[1]]
[1] 16 45 52 75 203 210 266 273 327 364 436 443 480 506 534 570 649
attr(,"match.length")
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
attr(,"useBytes")
[1] TRUE
Now I want to give gregexpr a pattern contained in a variable:
x="GGC"
Of course, if I write the variable name x inside the pattern string, gregexpr searches for the literal character "x" and not for what the variable contains:
gregexpr("(?=x)",text,perl=TRUE)
[[1]]
[1] -1
attr(,"match.length")
[1] -1
attr(,"useBytes")
[1] TRUE
How can I pass my variable to gregexpr in this positive lookahead case?
I'd use the sprintf function:
x <- "AGA"
text <- "ACAGAGACTTTAGATAGAGAAGA"
gregexpr(sprintf("(?=%s)", x), text, perl=TRUE)
## [[1]]
## [1] 3 5 12 16 18 21
## attr(,"match.length")
## [1] 0 0 0 0 0 0
## attr(,"useBytes")
## [1] TRUE
sprintf replaces the occurrence of %s in the format string with the value of x.
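One caveat with building the pattern this way: the value of x is interpreted as a regular expression, so a character such as "." keeps its special meaning. A minimal sketch, assuming perl=TRUE (PCRE), wraps the value in \Q...\E so it is matched literally; the example strings and the names plain_hits and quoted_hits are made up for illustration, and this does not cover a value that itself contains \E.

```r
# Illustrative data (not from the question): x contains the metacharacter "."
x <- "A.A"
text <- "AGAA.A"

# Unquoted: "." matches any character, so "AGA" at position 1 also counts
plain_hits <- as.integer(gregexpr(sprintf("(?=%s)", x), text, perl = TRUE)[[1]])

# Quoted with PCRE \Q...\E: x is matched literally, character by character
quoted_hits <- as.integer(gregexpr(sprintf("(?=\\Q%s\\E)", x), text, perl = TRUE)[[1]])

print(plain_hits)
print(quoted_hits)
```

For plain DNA strings such as "GGC" the quoting makes no difference, so it only matters when the variable may hold arbitrary text.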
You could use paste0, which is short for paste(..., sep=""):
x <- "GGC"
text <- 'ATGGGCTATGGGCTATGGGCTATGGGCT'
gregexpr(paste0('(?=', x, ')'), text, perl=TRUE)
# [[1]]
# [1] 4 11 18 25
# attr(,"match.length")
# [1] 0 0 0 0
# attr(,"useBytes")
# [1] TRUE
And if you want to access the overlapping matches themselves, take a look at the question "Overlapping matches in R".
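As a rough sketch of one way to get at the matched text: the lookahead itself has length zero, but a capture group placed inside it records where the pattern matched. With perl=TRUE, gregexpr returns "capture.start" and "capture.length" attributes that can be fed to substring. The variable names starts, lens and hits are only illustrative, and the data is reused from the sprintf example.

```r
x <- "AGA"
text <- "ACAGAGACTTTAGATAGAGAAGA"

# Capture group inside the lookahead: (?=(AGA))
m <- gregexpr(paste0("(?=(", x, "))"), text, perl = TRUE)[[1]]

# Start positions and lengths of the captured text, one entry per match
starts <- as.vector(attr(m, "capture.start"))
lens <- as.vector(attr(m, "capture.length"))

# Extract each (possibly overlapping) occurrence
hits <- substring(text, starts, starts + lens - 1)
print(starts)
print(hits)
```

With a fixed pattern every extracted piece equals x, so this is mostly useful when the variable holds a pattern that can match different strings.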
The fn$ prefix in the gsubfn package supports string interpolation:
library(gsubfn)
# test data
text <- "ATGGGCTAAATGGGCT"
x <- "GGGC"
fn$gregexpr("(?=$x)", text, perl = TRUE)
See ?fn, the gsubfn home page and the gsubfn vignette, vignette("gsubfn").
OK, I solved it this way:
text="ATGGGCTAAATGGGCT"
x="GGC"
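# build the lookahead pattern from x (named pattern rather than c, to avoid masking base::c)
pattern=paste("(?=",x,")",sep="")
r=gregexpr(pattern,text,perl=TRUE)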
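If this is needed repeatedly, the same idea can be wrapped in a small function. This is only a sketch: find_overlapping is a hypothetical helper name, not part of base R, and it assumes a single input string.

```r
# Hypothetical helper: start positions of all (overlapping) occurrences
# of pattern in a single string, or integer(0) if there are none
find_overlapping <- function(pattern, text) {
  m <- gregexpr(paste0("(?=", pattern, ")"), text, perl = TRUE)[[1]]
  pos <- as.integer(m)
  pos[pos > 0]  # gregexpr reports -1 when nothing matches
}

text <- "ATGGGCTAAATGGGCT"
x <- "GGC"
positions <- find_overlapping(x, text)
print(positions)
```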