
Regex with an optional group part

I have two kinds of strings that I need to match and split into groups:

<133>[S=88121248] [SID:1073710562] (   lgr_psbrdif)(72811810  )   #38:OpenChannel:on Trunk 0 BChannel:9 CID=38 with VoiceCoder: g711Alaw64k20 VbdCoder: InvalidCoder255 DetectorSide: 0 FaxModemDet NO_FAX_MODEM_DETECTED 

and

<133>[S=88209541] (     sip_stack)(73281971  )   TcpTransportObject#430::DispatchQueueEvent(EVENT_RECEIVER_DISCONNECT) - Closing connection  

I need to match both and capture specific groups. I use this pattern:

<(.*)>\[S=(.*)\] (\[SID:(.*?)\])?(.*)

What I get:

Match0: <133>[S=88121248] [SID:1073710562] ......the full line  
Group1: 133  
Group2: 88121248] [SID:1073710562  
Group3:  
Group4:  
Group5: ......the full line  

Match1: <133>[S=88209541] ......the full line  
Group1: 133  
Group2: 88209541  
Group3:   
Group4:  
Group5: ......the full line  

What I need:

Match0: <133>[S=88121248] [SID:1073710562] ......the full line  
Group1: 133  
Group2: 88121248  
Group3: 1073710562  
Group4:  
Group5: ......the full line  


Match1: <133>[S=88209541] ......the full line  
Group1: 133  
Group2: 88209541  
Group3:  
Group4:  
Group5: ......the full line  

To sum up: both strings match, but the grouping is wrong. The second string is matched and grouped correctly; the first is not.

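You are making a typical mistake: the greedy star .* overshoots your intended match. In the first string, the (.*) after [S= runs on to the last "] " in the line, so it swallows the [SID:...] part as well and leaves nothing for the optional group.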
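The overshoot can be reproduced with a small sketch. Python's `re` module is an assumption here, since the question does not say which regex engine is used, and the sample line is shortened for readability:

```python
import re

# The pattern from the question, unchanged.
GREEDY = re.compile(r"<(.*)>\[S=(.*)\] (\[SID:(.*?)\])?(.*)")

# First sample line from the question, with its tail cut off for readability.
line = "<133>[S=88121248] [SID:1073710562] (   lgr_psbrdif)(72811810  )   #38:OpenChannel"

m = GREEDY.match(line)
print(repr(m.group(2)))  # '88121248] [SID:1073710562'  (greedy .* ran past the first "]")
print(repr(m.group(3)))  # None  (the optional group had nothing left to match)
print(repr(m.group(5)))  # the rest of the line, starting at "(   lgr_psbrdif)"
```

Group 2 stops at the last "] " in the line rather than the first, which is exactly the grouping reported in the question.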

To match anything between two delimiters, it is better to use a negated character class instead, for example <([^>]*)> to match whatever sits between < and >.
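As a rough illustration (again assuming Python's `re`, which the question does not specify), the two approaches can be compared on the start of the first sample line:

```python
import re

# Start of the first sample line from the question.
text = "<133>[S=88121248] [SID:1073710562]"

greedy = re.search(r"\[S=(.*)\]", text)       # .* may cross "]" characters
negated = re.search(r"\[S=([^\]]*)\]", text)  # [^\]]* stops at the first "]"

print(greedy.group(1))   # 88121248] [SID:1073710562
print(negated.group(1))  # 88121248

# The same idea for the angle brackets.
print(re.search(r"<([^>]*)>", text).group(1))  # 133
```

The negated class cannot cross its closing delimiter, so no backtracking is needed to find the right end point.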

So this would work:

^<([^>]*)>\[S=([^\]]*)\]\s+(?:\[SID:([^\]]*)\]\s+)?(.*)

Breakdown:

^<([^>]*)>                # something between < and > at the start of the line
\[S=([^\]]*)\]\s+         # something between "[S=" and "]", then whitespace
(?:\[SID:([^\]]*)\]\s+)?  # something between "[SID:" and "]", then whitespace; the whole part is optional
(.*)                      # rest of the string

Note the non-capturing parentheses (?:...): they make the whole [SID:...] part optional without adding an unused group to the result.
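The effect of (?:...) on group numbering can be checked in isolation. This is a sketch assuming Python's `re`; the input string is a made-up fragment built from the SID value in the question:

```python
import re

# Only the optional SID part of the pattern, in both variants.
capturing = re.compile(r"(\[SID:([^\]]*)\]\s+)?")
non_capturing = re.compile(r"(?:\[SID:([^\]]*)\]\s+)?")

sample = "[SID:1073710562] rest"  # illustrative input, not a full log line

print(capturing.groups)      # 2
print(non_capturing.groups)  # 1

print(capturing.match(sample).groups())      # ('[SID:1073710562] ', '1073710562')
print(non_capturing.match(sample).groups())  # ('1073710562',)
```

With the capturing variant, the SID value lands in a second group and the first one holds the bracketed text; the non-capturing variant leaves only the value.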

Matches:

MATCH 1
1.  [1-4]   `133`
2.  [8-16]  `88121248`
3.  [23-33] `1073710562`
4.  [35-218]    `(   lgr_psbrdif)(72811810  )   #38:OpenChannel:on Trunk 0 BChannel:9 CID=38 with VoiceCoder: g711Alaw64k20 VbdCoder: InvalidCoder255 DetectorSide: 0 FaxModemDet NO_FAX_MODEM_DETECTED `

MATCH 2
1.  [220-223]   `133`
2.  [227-235]   `88209541`
3.  n/a
4.  [237-360]   `(     sip_stack)(73281971  )   TcpTransportObject#430::DispatchQueueEvent(EVENT_RECEIVER_DISCONNECT) - Closing connection  `
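For reference, the same result can be reproduced in code. This is a minimal sketch that assumes Python's `re` module (the question does not name a regex flavor); `parse_line` and the variable names are hypothetical helpers, not part of the original answer:

```python
import re

# Pattern copied from the answer above.
PATTERN = re.compile(r"^<([^>]*)>\[S=([^\]]*)\]\s+(?:\[SID:([^\]]*)\]\s+)?(.*)")

# The two sample lines from the question.
LINES = [
    "<133>[S=88121248] [SID:1073710562] (   lgr_psbrdif)(72811810  )   #38:OpenChannel:on Trunk 0 BChannel:9 CID=38 with VoiceCoder: g711Alaw64k20 VbdCoder: InvalidCoder255 DetectorSide: 0 FaxModemDet NO_FAX_MODEM_DETECTED ",
    "<133>[S=88209541] (     sip_stack)(73281971  )   TcpTransportObject#430::DispatchQueueEvent(EVENT_RECEIVER_DISCONNECT) - Closing connection  ",
]

def parse_line(line):
    """Return the four captured groups as a tuple, or None if the line does not match."""
    m = PATTERN.match(line)
    return m.groups() if m else None

for line in LINES:
    first, s_value, sid, rest = parse_line(line)
    # sid is None when the line has no [SID:...] part.
    print(first, s_value, sid, sep=" | ")
```

In Python an optional group that did not participate in the match is reported as `None`, which corresponds to the "n/a" entry for the second line.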
