Capture a group in a multi-line string with multiple matches using regular expression in Ruby

I'm trying to capture the string '1611650547*42' in the multi-line string below.

myString = "'/absfucate/wait.do;cohrAwdSessID=jbreW9yA8R0xh9b?
obfuscateId=jbreW9yA8R0xh9b&checksum=1611650547*42&tObfuscate=null&
tSession_1DS=null&obsfuscate3=DeptNLI8261138&
dispatchMethod=obfuscate'+ '&poll= 
8R0xh9b&checksum=1611650547*42&tSession=null'"

I'm using the code below, and the string contains two matches:

/checksum=(?<checksum>\d*\*\d*)/m.match(myString)['checksum']

The capturing group checksum works for a string with one match, but when multiple matches are found it throws the following error:

undefined method `[]' for nil:NilClass (NoMethodError)

It's hard to be 100% sure about your input and the criteria around the *. How about trying something a bit more specific (Ruby 2):

if myString =~ /(?m)checksum=\K\d*\*\d*/
    checksum = $&   # $& holds the text of the last successful match
end

What does the regex mean?

  • Use these options for the whole regular expression (?m)
    • Dot matches line breaks m
  • Match the character string “checksum=” literally (case sensitive) checksum=
  • Keep the text matched so far out of the overall regex match \K
  • Match a single character that is a “digit” (ASCII 0–9 only) \d*
    • Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
  • Match the character “*” literally \*
  • Match a single character that is a “digit” (ASCII 0–9 only) \d*
    • Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
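
The breakdown above can be tried out with a minimal sketch like the following. The sample string is a shortened, hypothetical stand-in for the one in the question, and the variable names are illustrative only. It assumes Ruby 2.0 or later, where \K is available.

```ruby
# Hypothetical sample, shortened from the string in the question.
sample = "obfuscateId=abc&checksum=1611650547*42&tObfuscate=null&\n" \
         "poll=xyz&checksum=1611650547*42&tSession=null"

# \K drops "checksum=" from the reported match, so no capture group is needed.
pattern = /checksum=\K\d*\*\d*/

first_checksum = sample[pattern]       # first match only, or nil when nothing matches
all_checksums  = sample.scan(pattern)  # every match, as an array of strings

puts first_checksum
p all_checksums
```

Since the pattern contains no dot, the (?m) option is left out here; it should not change what this particular pattern matches.
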
myString = "'/absfucate/wait.do;cohrAwdSessID=jbreW9yA8R0xh9b?
obfuscateId=jbreW9yA8R0xh9b&checksum=1611650547*42&tObfuscate=null&
tSession_1DS=null&obsfuscate3=DeptNLI8261138&
dispatchMethod=obfuscate'+ '&poll= 
8R0xh9b&checksum=1611650547*42&tSession=null'"

myString.scan(/checksum=[^&]+/) # => ["checksum=1611650547*42", "checksum=1611650547*42"]

Since your string contains two, and you don't say which one you want, pick one or the other, then continue processing:

myString.scan(/checksum=[^&]+/).first.split('=').last # => "1611650547*42"

Basically, /checksum=[^&]+/ means: find "checksum=", then the text following it up to the next &. Once I have those strings it's easy to split them on =.

Regexes aren't magic bullets, and will make your life more and more miserable the longer and more complex they become, so use them carefully and sparingly. Rather than try to process the entire line in one pattern, scan lets me use a small pattern to locate only what I want, and it handles the task of looping through the entire string.
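
As a variation on the scan approach, here is a small sketch that puts a capture group in the pattern so the split step isn't needed. The sample string is a shortened, hypothetical version of the one in the question, and the variable names are illustrative.

```ruby
# Hypothetical sample, shortened from the string in the question.
sample = "obfuscateId=abc&checksum=1611650547*42&tObfuscate=null&\n" \
         "poll=xyz&checksum=1611650547*42&tSession=null"

# With a capture group, scan returns one array of captures per match,
# so flatten turns [["..."], ["..."]] into a flat list of values.
checksums = sample.scan(/checksum=([^&]+)/).flatten

p checksums
p checksums.uniq  # collapse repeats if only the distinct values matter
```

Whether first, last, or uniq is the right follow-up depends on which occurrence is actually wanted.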

If I was only after one of the occurrences, I'd use a pattern and match. These are equivalent to what you were after, only they're more succinct:

myString.match(/checksum=(?<checksum>[^&]+)/m)[:checksum] # => "1611650547*42"
myString.match(/checksum=(?<checksum>[\d*]+)/m)[:checksum] # => "1611650547*42"

For readability, I'd pass the pattern as the argument to match, rather than calling match on the regex literal right after the m flag.
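
The NoMethodError in the question is what Ruby raises when match finds nothing: match returns nil, and indexing nil with [] fails. A minimal sketch of guarding against that follows; extract_checksum is a hypothetical helper name, not something from the original post.

```ruby
# Hypothetical helper: returns the checksum text, or nil when there is no match.
def extract_checksum(text)
  m = text.match(/checksum=(?<checksum>[^&]+)/)
  m && m[:checksum]  # only index into the MatchData when a match exists
end

found   = extract_checksum("a=1&checksum=1611650547*42&b=2")
missing = extract_checksum("a=1&b=2")

# String#[] with a regex and a capture name does the same in one call,
# and also returns nil instead of raising when nothing matches.
short_form = "a=1&checksum=1611650547*42&b=2"[/checksum=(?<checksum>[^&]+)/, "checksum"]

p found
p missing
p short_form
```

If the real input raises this error, it may be worth printing the string first to confirm that it actually contains "checksum=" followed by digits, a * and more digits.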
