Python regex - Find multiple characters in a substring

print 'cycle' ;While i in range(1,n) [[print "Number:" ;print i; print 'and,']]

I have a line like the one above, for example. I want to extract only the semicolons that appear inside the [[ ... ]] substring, that is, between the double square brackets.

If I use re.search with the pattern \[\[.*(\s*;).*\]\], I get only one semicolon. Is there a proper solution for this?
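To make the symptom concrete, here is a minimal sketch of that attempt in Python 3. The variable name line is an assumption, since the question does not show how the string is stored. A capture group holds only one value per match, and because the leading .* is greedy, that value ends up being the last semicolon before ]].

```python
import re

# The example line from the question (the variable name is illustrative).
line = """print 'cycle' ;While i in range(1,n) [[print "Number:" ;print i; print 'and,']]"""

m = re.search(r"\[\[.*(\s*;).*\]\]", line)

# There is exactly one capture group, so only one semicolon can be reported.
print(m.group(1))
# The greedy ".*" pushes the group to the last semicolon inside the brackets.
print(m.start(1) == line.rfind(";"))
```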

Regex is never a great choice for things like this because it's very easy to trip up, but the following pattern works in trivial cases:

;(?=(?:(?!\[\[).)*\]\])

Pattern breakdown:

;                # match literal ";"
(?=              # lookahead assertion: assert the following pattern matches:
    (?:          # non-capturing group:
        (?!\[\[) # as long as we don't find a "[["...
        .        # ...consume the next character
    )*           # ...as often as necessary
    \]\]         # until we find "]]"
)

In other words, the pattern matches a semicolon only if a ]] appears later on the same line with no [[ between the semicolon and that ]].
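As a quick check, the pattern can be run against the line from the question with re.finditer. This is only a sketch; the names line and inside_brackets are illustrative and do not come from the original post.

```python
import re

# The example line from the question (the variable name is illustrative).
line = """print 'cycle' ;While i in range(1,n) [[print "Number:" ;print i; print 'and,']]"""

# The lookahead pattern from the answer, compiled once for reuse.
inside_brackets = re.compile(r";(?=(?:(?!\[\[).)*\]\])")

# Collect the index of every semicolon the pattern accepts.
positions = [m.start() for m in inside_brackets.finditer(line)]
print(len(positions))  # 2
print(positions)
```

The semicolon after 'cycle' is skipped because a [[ sits between it and the closing ]]; the two semicolons inside the brackets are reported.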


Examples of strings where the pattern gives the wrong result:

  • ; ]] (will match, even though there is no opening [[)
  • [[ ; "this is text [[" ]] (won't match, because a [[ appears between the semicolon and the closing ]])
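Both failure cases can be reproduced directly. The sketch below also shows one possible alternative, which is not part of the original answer: a two-step approach that first locates each [[...]] block and then looks for semicolons inside it. The helper name semicolons_in_blocks is hypothetical, and this version still assumes blocks are not nested and that ]] never appears inside a quoted string.

```python
import re

lookahead = re.compile(r";(?=(?:(?!\[\[).)*\]\])")

# False positive: there is no opening "[[", yet the semicolon matches.
print(lookahead.findall("; ]]"))                       # [';']
# False negative: the quoted "[[" blocks the lookahead.
print(lookahead.findall('[[ ; "this is text [[" ]]'))  # []

# Hypothetical alternative: find each [[...]] block first (non-greedy),
# then search for semicolons only inside the captured block.
block = re.compile(r"\[\[(.*?)\]\]")

def semicolons_in_blocks(text):
    """Return the absolute index of each ';' found inside a [[...]] block."""
    positions = []
    for m in block.finditer(text):
        offset = m.start(1)  # where the block's contents begin in text
        positions.extend(offset + i for i, ch in enumerate(m.group(1)) if ch == ";")
    return positions

print(semicolons_in_blocks("; ]]"))                       # []
print(semicolons_in_blocks('[[ ; "this is text [[" ]]'))  # [3]
```

Requiring the opening [[ explicitly removes the false positive, and scanning the captured text instead of using a lookahead means a stray [[ inside the block no longer hides the semicolon.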
