
Regex - Match several groups

I'm trying to extract from the following example string (all in one line):

First Note Type[br]03/11/2015          12:51:24            USR123[br]Now is the time for all good men to come to the aid of their country[br]Second Note Type[br]03/11/2015          16:26:03            USR456[br]The quick brown fox jumped over the lazy dog.

2 matches with 5 groups each, e.g.:

Match 1:

  • G1 -> 'First Note Type'
  • G2 -> '03/11/2015'
  • G3 -> '12:51:24'
  • G4 -> 'USR123'
  • G5 -> 'Now is the time for all good men to come to the aid of their country[br]'

Match 2:

  • G1 -> 'Second Note Type'
  • G2 -> '03/11/2015'
  • G3 -> '16:26:03'
  • G4 -> 'USR456'
  • G5 -> 'The quick brown fox jumped over the lazy dog.'

So far I've only managed to match the first 4 groups using the following expression:

([a-zA-Z\s]+)\[br\]([0-9]+/[0-9]+/[0-9]+)\s+([0-9]+:[0-9]+:[0-9]+)\s+([a-zA-Z0-9]+)\[br\]

I could not get the fifth group (G5) as well. I tried adding a (.+), but that results in only a single match instead of n.

Can anyone point me in the right direction?

When you use (.+), it matches one or more characters other than a newline, as many as possible, so it consumes everything up to the end of the line.
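To see the greedy behaviour concretely, here is a minimal Python sketch. Python and its re module are an assumption, since the question does not name a language, and the variable names are illustrative only. It appends (.+) to the four-group expression from the question and runs it against the sample string.

```python
import re

# Sample input from the question (a single line, with [br] as separator).
text = (
    "First Note Type[br]03/11/2015          12:51:24            USR123[br]"
    "Now is the time for all good men to come to the aid of their country[br]"
    "Second Note Type[br]03/11/2015          16:26:03            USR456[br]"
    "The quick brown fox jumped over the lazy dog."
)

# The four-group expression from the question with a greedy (.+) appended.
greedy = re.compile(
    r"([a-zA-Z\s]+)\[br\]([0-9]+/[0-9]+/[0-9]+)\s+"
    r"([0-9]+:[0-9]+:[0-9]+)\s+([a-zA-Z0-9]+)\[br\](.+)"
)

greedy_matches = list(greedy.finditer(text))

# (.+) runs to the end of the line, so group 5 of the only match
# also contains the whole second note.
print(len(greedy_matches))
print(greedy_matches[0].group(5))
```

With this input the sketch finds a single match, and its fifth group holds the first note's text followed by the entire second note.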

You can match it with the following regex:

([a-zA-Z\s]+)\[br]([0-9]+/[0-9]+/[0-9]+)\s+([0-9]+:[0-9]+:[0-9]+)\s+([a-zA-Z0-9]+)\[br]([^[]*(?:\[(?!br])[^[]*)*(?:\[br])?)

See regex demo
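For readers who want to try this locally, the following is a small sketch that runs the regex above with Python's re module. The choice of Python and the names note_re and notes are assumptions made for illustration; the pattern itself is copied unchanged and only split across lines for commenting.

```python
import re

# Sample input from the question (a single line, with [br] as separator).
text = (
    "First Note Type[br]03/11/2015          12:51:24            USR123[br]"
    "Now is the time for all good men to come to the aid of their country[br]"
    "Second Note Type[br]03/11/2015          16:26:03            USR456[br]"
    "The quick brown fox jumped over the lazy dog."
)

# Adjacent raw strings are concatenated into one pattern.
note_re = re.compile(
    r"([a-zA-Z\s]+)\[br]"                    # G1: note type
    r"([0-9]+/[0-9]+/[0-9]+)\s+"             # G2: date
    r"([0-9]+:[0-9]+:[0-9]+)\s+"             # G3: time
    r"([a-zA-Z0-9]+)\[br]"                   # G4: user id
    r"([^[]*(?:\[(?!br])[^[]*)*(?:\[br])?)"  # G5: note text, optional trailing [br]
)

# One tuple of five captured groups per note.
notes = [m.groups() for m in note_re.finditer(text)]
for groups in notes:
    print(groups)
```

finditer is used rather than a single search so that every note in the line is returned, each with its own five groups.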

I added the ([^[]*(?:\[(?!br])[^[]*)*(?:\[br])?) part. It matches any text up to the next [br] and then, optionally, that [br] itself. A more detailed breakdown:

  • [^[]* - match 0 or more characters other than [
  • (?:\[(?!br])[^[]*)* - match 0 or more sequences of...
    • \[(?!br]) - a literal [ not followed by br]
    • [^[]* - 0 or more characters other than [
  • (?:\[br])? - match 1 or 0 occurrences of the literal sequence [br]
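The breakdown above can be checked in isolation. The sketch below (Python assumed; body_re and note_body are hypothetical names introduced here) applies only the fifth group's sub-pattern to a few made-up strings, including ones that contain a [ that does not start [br].

```python
import re

# The sub-pattern of the fifth group, copied from the answer.
body_re = re.compile(r"[^[]*(?:\[(?!br])[^[]*)*(?:\[br])?")


def note_body(s):
    """Return the prefix of s that the fifth group's sub-pattern matches."""
    # Every part of the pattern is optional, so match() always succeeds.
    return body_re.match(s).group(0)


# Stops after the first [br].
print(note_body("plain text[br]Next Note"))
# A [ that is not followed by br] is consumed as ordinary text.
print(note_body("see [1] and [bravo] here[br]Next Note"))
# Without any [br], everything up to the end is matched.
print(note_body("last note, no separator"))
```

The negative lookahead is what lets brackets such as [1] or [bravo] pass through while still stopping at the separator.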

Results obtained with your string as input:

[image: match results from the regex demo]
