
Capturing a repeating group

I am trying to write a regex that matches and captures the following:

String: 17+18+19+5+21

The numbers to be captured (each separately) are the ones present in the array [17, 18, 21].

Please note that the string can be arbitrarily long (following the same pattern of \\d+ terms joined by +), and the order of these numbers in the string is not fixed.

Thanks in advance

Given this setup:

library(gsubfn)
s <- "17+18+19+5+21"
a <- c(17, 18, 21)

1) Try this:

L <- as.list(c(setNames(a, a), NA))
strapply(s, "\\d+", L, simplify = na.omit)

giving:

[1] 17 18 21
attr(,"na.action")
[1] 3 4
attr(,"class")
[1] "omit"

2) or this:

pat <- paste(a, collapse = "|")
strapplyc(s, pat, simplify = as.numeric)

giving:

[1] 17 18 21

3) or this non-regexp solution:

intersect(scan(text = s, what = 0, sep = "+", quiet = TRUE), a)

giving:

[1] 17 18 21
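One thing to keep in mind with 3) is that intersect() drops duplicates, so a number that occurs twice in the string is returned only once. If every occurrence is wanted, in string order, a base-R variation might look like the sketch below. The longer example string and the names parts and kept are assumptions for illustration, not part of the original answer.

```r
# Sketch only: keep every occurrence of the wanted numbers, in string order.
# The input string and the variable names are illustrative assumptions.
s <- "21+18+19+5+21+18+19+17"
a <- c(17, 18, 21)

# Split on the literal "+" (fixed = TRUE avoids regex interpretation of "+")
parts <- as.numeric(strsplit(s, "+", fixed = TRUE)[[1]])

# %in% keeps duplicates, unlike intersect()
kept <- parts[parts %in% a]
kept
# [1] 21 18 21 18 17
```

For comparison, intersect(parts, a) on the same input would give 21 18 17.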

ADDED additional solution.

How about simply:

(17|18|21)

It needs to be a global match, so in Perl it would look like this:

my @matches = $string =~ m/(17|18|21)/g;  # list context returns every capture

Example string:

21+18+19+5+21+18+19+17

Matches:

"21", "18", "21", "18", "17"

Working regex example:

http://regex101.com/r/jL8iF7

You can use gregexpr and regmatches:

vec <- "17+18+19+5+21"
a <- c(17, 18, 21) 

pattern <- paste0("\\b(", paste(a, collapse = "|"), ")\\b")
# [1] "\\b(17|18|21)\\b"

regmatches(vec, gregexpr(pattern, vec))[[1]]
# [1] "17" "18" "21"

Note that this matches the exact number, i.e., 17 does not match 177.
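To see what the \\b anchors do, here is a small sketch comparing the bare alternation with the word-bounded one. The test string and the names alt, loose and strict are made up for illustration.

```r
# Sketch only: compare a bare alternation with a word-bounded one.
# The input string and variable names are illustrative assumptions.
vec <- "177+17+218+21"
a <- c(17, 21)

alt <- paste(a, collapse = "|")   # "17|21"

# Without \\b the pattern also matches inside longer numbers (177, 218)
loose <- regmatches(vec, gregexpr(alt, vec))[[1]]

# With \\b only whole numbers are matched
strict <- regmatches(vec, gregexpr(paste0("\\b(", alt, ")\\b"), vec))[[1]]

loose
# [1] "17" "17" "21" "21"
strict
# [1] "17" "21"
```

The first "17" and the first "21" in loose come from 177 and 218, which is the kind of partial match the \\b anchors are meant to exclude.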
