简体   繁体   中英

How to match repeating patterns containing special characters in Ruby using regex?

Basically, I am trying to use regex to match repeating patterns containing special characters in Ruby. I've been able to do this if I am given the number of times a pattern repeats but not dynamically. An example string I am looking to match is:

Draw a square that is {{coords.width}} pixels wide by {{coords.height}} pixels tall.

This can be easily done by using

arr = value.scan(/\\{\\{(\\w+?\\.\\w+?)\\}\\}/).flatten

arr looks like this after I run this

["coords.width", "coords.height"]

But how do I write a regex which can match in case this pattern follows arbitrarily, for example

Draw a square that is {{shape.rectangle.coords.width}} pixels wide by {{shape.rectangle.coords.height}} pixels tall.

while also matching in case of the following(no ".")

Draw a square that is {{width}} pixels wide by {{height}} pixels tall.

You can match the regular expression

r = /(?<=\{\{)[a-z]+(?:\.[a-z]+)*(?=\}\})/

Rubular demo / PCRE demo at regex 101.com

I've included the PCRE demo because regex101.com provides a detailed explanation of each element of the regex (hover the cursor).

For example,

str = "Draw a square {{coords.width}} wide by {{coords.height}} " +
      "tall by {{coords deep}} deep"
str.scan(r)
  #=> ["coords.width", "coords.height"]

Notice that "coords deep" does not match because it does not have (what I have assumed is) a valid form. Notice also that I did not have to flatten the return value from scan because the regex has no capture groups.

We can write the regular expression in free-spacing mode to make it self-documenting.

/
(?<=      # begin a positive lookbehind
  \{\{    # match 1 or more lower case letters
)         # end the positive lookbehind
[a-z]+    # match 1 or more lower case letters
(?:       # begin a non-capture group
  \.      # match a period
  [a-z]+  # match 1 or more lower case letters
)         # end the non-capture group
*         # execute the non-capture group zero or more times
(?=       # begin a positive lookahead
  \}\}    # match '}}'
)         # end positive lookahead
/x        # free-spacing regex definition mode

(/\\{\\{(.*?)\\}\\}/)

This did the trick. It matches anything that's within {{}} but I can always validate the structure upon extracting the occurrences/patterns

(\\{+\\S+)

The pattern above would achieve your goals. It matches all non space characters within the enclosure.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM