
Match string numbers with Ruby regex

I need help matching strings with a Ruby regular expression (it's for Puppet).

  1. How can I match everything that ends with a number from 001 to 010?

Example: master001, master002, master003

  2. Then I need to match everything that ends with a number from 011 to 999.

Example: master011, master012 ..... master997, master998, master999

How can I match everything that ends with a number from 001 to 010?

\w+0(?:0[1-9]|10)

And then I need to match everything that ends with a number from 011 to 999.

\w+(?:0[1-9]|[1-9]\d)\d

See it live here and here
And as suggested by @Cary, you can run it with str.scan
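
A quick way to check the two patterns against sample names is to anchor them and run them in irb. This is only a minimal sketch: the constant names and the \A...\z anchors are my own additions, not part of the patterns above. It also shows that the second pattern, as written, accepts 010 as well.

```ruby
# The two patterns from this answer, wrapped in \A...\z so the whole
# string must match. Constant names and anchors are illustrative assumptions.
LOW  = /\A\w+0(?:0[1-9]|10)\z/         # names ending in 001..010
HIGH = /\A\w+(?:0[1-9]|[1-9]\d)\d\z/   # names ending in 010..999 (010 included)

names = %w[master001 master010 master011 master999 master000]

low  = names.grep(LOW)   # names accepted by the first pattern
high = names.grep(HIGH)  # names accepted by the second pattern

p low   #=> ["master001", "master010"]
p high  #=> ["master010", "master011", "master999"]
```

Since master010 shows up in both lists, the first pattern has to be checked first if each name should land in exactly one group.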

My first attempt would be these two:

"master001".match(/010$|00[1-9]$/) # matches "001" up to "009", and "010"
"master099".match(/0[1-9]\d$|[1-9]\d\d$/) # intended for "011" up to "999", but also matches "010"

Edit: my second attempt would be these:

"master001".match(/010$|00[1-9]$/) # matches "001" up to "009", and "010"
"master099".match(/0[1-9]\d$|[1-9]\d\d$/) # matches "010" up to "999"

The second regex also catches 010, but that's okay if you have already caught it with the first one.
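
If the overlap on 010 matters (for instance, when the two patterns are checked independently of each other), the second range can be spelled out so that it starts at 011. A sketch, with constant names of my own choosing:

```ruby
# 001..010: either 00 followed by 1-9, or exactly 010.
FIRST_RANGE  = /(?:00[1-9]|010)$/
# 011..999: 011-019, then 020-099, then 100-999. 010 is deliberately left out.
SECOND_RANGE = /(?:01[1-9]|0[2-9]\d|[1-9]\d\d)$/

# Every name from master000 to master999.
all_names = ("000".."999").map { |n| "master#{n}" }

first   = all_names.grep(FIRST_RANGE)
second  = all_names.grep(SECOND_RANGE)
overlap = first & second

puts first.size    # 10
puts second.size   # 989
puts overlap.size  # 0
```

Like the patterns above, these only inspect the last three characters, so a name such as master1001 would still match the first one.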

Anyway, kudos to @Cyrbil.

The "everything" in "How can I match everything..." is quite vague. Can "everything" contain any characters, including spaces? What about "cat_1001", which consists entirely of word characters ("cat_1001" =~ /\w+/ #=> 0)? That string ends with the (string representation of the) number "1001", but its last three characters are "001". Should it be a match? Do you want to match the string "007" (three digits with nothing before them)? I have assumed you want to match substrings that:

  • start at the beginning of the string or are preceded by a non-letter
  • have one or more letters (uppercase or lowercase)
  • have three digits
  • are at the end of the string or are followed by a non-digit

Suppose the string were:

str = "Ann010, Bee012, Bob001 and Hank999a are MI6; 007, Deb0001 and Paul000 aren't"

Applying the rules for matching that I've adopted, the first group (1-10) is comprised of Ann and Bob; the second group (11-999), Bee and Hank.

This can be accomplished with the following regex:

r = /
     [a-z]+ # match one or more letters
     \d{3}  # match three digits
     (?!\d) # do not match another digit (negative lookahead)
    /ix     # case-insensitive and extended/free-spacing modes

to extract candidates:

arr = str.scan(r)
  #=> ["Ann010", "Bee012", "Bob001", "Hank999", "Paul000"]

which can then be filtered as desired:

arr.select { |s| (1..10).cover? s[-3..-1].to_i }
  #=> ["Ann010", "Bob001"] 
arr.select { |s| (11..999).cover? s[-3..-1].to_i }
  #=> ["Bee012", "Hank999"] 
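
The scan-then-filter steps can also be folded into one helper that captures the digits and compares them numerically. A minimal sketch, assuming the matching rules listed above; the method name names_in_range is hypothetical:

```ruby
# Hypothetical helper: returns the names in `text` whose three-digit suffix,
# read as a decimal integer, falls inside `range`.
def names_in_range(text, range)
  # Letters, then exactly three digits that are not followed by another digit.
  # Because the regex has capture groups, scan yields [name, digits] pairs.
  text.scan(/([a-z]+(\d{3}))(?!\d)/i)
      .select { |_name, digits| range.cover?(digits.to_i) }
      .map(&:first)
end

str = "Ann010, Bee012, Bob001 and Hank999a are MI6; 007, Deb0001 and Paul000 aren't"

p names_in_range(str, 1..10)    #=> ["Ann010", "Bob001"]
p names_in_range(str, 11..999)  #=> ["Bee012", "Hank999"]
```

Comparing integers keeps the regex simple and makes the range boundaries easy to change later.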

Cyrbil's answer looks nice, but it takes some thought to verify and it misses edge cases (its second pattern also matches 010, for example). You can play it safe with the somewhat uglier:

/\w+(?:#{('001'..'010').to_a.join('|')})\b/

and

/\w+(?:#{('011'..'999').to_a.join('|')})\b/
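
To see what the interpolation expands to and how the resulting patterns behave, something like the following can be run in irb. The variable names are mine. Note that because \w also matches digits, a name such as cat_1001 is still accepted by the first pattern.

```ruby
# Build the alternations from string ranges; '001'..'010' steps via String#succ.
low_alternation  = ('001'..'010').to_a.join('|')
high_alternation = ('011'..'999').to_a.join('|')

low  = /\w+(?:#{low_alternation})\b/
high = /\w+(?:#{high_alternation})\b/

puts low.source  # \w+(?:001|002|003|004|005|006|007|008|009|010)\b

p "master001".match?(low)   #=> true
p "master010".match?(high)  #=> false, 010 is not in the second list
p "master011".match?(high)  #=> true
p "cat_1001".match?(low)    #=> true, because \w+ can absorb the leading 1
```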
