Trouble matching multiple patterns in Ruby (regex)

Question

What is the difference between the following regexes: HEAD|POST, (HEAD|POST) and [HEAD|POST]?

Basically, I want to extract the number after either HEAD or POST.

irb(main):001:0> "This is HEAD and a POST".match("HEAD|POST")
=> #<MatchData "HEAD">
irb(main):002:0> "This is HEAD and a POST".match("(HEAD|POST)")
=> #<MatchData "HEAD" 1:"HEAD">
irb(main):003:0> "This is HEAD and a POST".match("[HEAD|POST]")
=> #<MatchData "T">
irb(main):004:0> "This is HEAD 1 and a POST 2".match("[HEAD|POST] (.)")
=> #<MatchData "D 1" 1:"1">

The last regex didn't match the "2" that is after "POST". Why? Also, why is "D 1" being matched?

Answer 1

HEAD|POST and (HEAD|POST) match the same strings (either HEAD or POST); the second one also captures the matched string as group 1, while the first captures nothing.

[HEAD|POST] is a character class: it matches a single character, any one of A, D, E, H, O, P, S, T or |. So "This is HEAD and a POST".match("[HEAD|POST]") matches the single character T in This.

On the other hand, "This is HEAD 1 and a POST 2".match("[HEAD|POST] (.)") can't match the leading T because it isn't followed by a space. Instead it matches the single D at the end of HEAD, plus the space and the 1 that follow, capturing the 1.
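To see the three forms side by side, here is a small sketch using the strings from the question. The variable names are just for illustration.

```ruby
text = "This is HEAD 1 and a POST 2"

# Bare alternation: matches the first HEAD or POST, with no capture groups.
plain = text.match(/HEAD|POST/)

# The same alternation in parentheses: the matched word is also group 1.
grouped = text.match(/(HEAD|POST)/)

# Character class: matches ONE character out of H, E, A, D, |, P, O, S, T.
# The first such character in the string is the T of "This".
char_class = text.match(/[HEAD|POST]/)

# Class, then a space, then any character: the T of "This" is followed by
# "h", so the first position that fits is the D of "HEAD" followed by " 1".
class_then_char = text.match(/[HEAD|POST] (.)/)

puts plain[0]             # HEAD
puts grouped[1]           # HEAD
puts char_class[0]        # T
puts class_then_char[0]   # D 1
```

Note that match only ever returns the first match, which is why the 2 after POST never shows up in any of these results.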

Answer 2

Try scan, which returns every match instead of only the first one:

"This is HEAD 1 and a POST 2".scan /(HEAD|POST)\s(\d)/

=> [["HEAD", "1"], ["POST", "2"]]
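If only the numbers are wanted, a possible variation is sketched below. It assumes the numbers may have more than one digit and may be separated from the keyword by more than one whitespace character, which goes slightly beyond what the question states. The second input string is made up for illustration.

```ruby
text = "This is HEAD 1 and a POST 2"

# (?:...) is a non-capturing group, so only the digits are captured.
# With a single capture group, scan returns one-element arrays; flatten
# turns [["1"], ["2"]] into a plain list of strings.
numbers = text.scan(/(?:HEAD|POST)\s+(\d+)/).flatten
p numbers   # ["1", "2"]

# Keeping both groups still works with multi-digit numbers.
pairs = "HEAD 10 then POST 200".scan(/(HEAD|POST)\s+(\d+)/)
p pairs     # [["HEAD", "10"], ["POST", "200"]]
```

The captured values are strings; call to_i on them (for example numbers.map(&:to_i)) if integers are needed.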

2 answers

solution1
4 ACCEPTED 2012-07-11 13:19:10

solution2
1 2012-07-11 13:16:14
