
Regex to capture multiple groups separated by an initial delimiter

I have a string like this:

|T1| This is some text for the first tag |T2| this is some text for the second tag

I need to parse out the tags and the text associated with each one. The tags are not known ahead of time, but each one matches \|\w+\|.

I know capturing groups should help here, but after messing around in PowerShell the best I can come up with is to first isolate each pairing using \|\w+\|.* with the ExplicitCapture option, and then parse out the tag and text from there.

But that is doing double the work and totally not super-cool haxor. What's the regex-pro way to do this?

Edit: Actually, I realize that it's late and I misread my results. The above doesn't actually work, so now I don't even have a bad solution.
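For anyone following along, here is a minimal sketch of why that first attempt falls over, assuming the .NET regex engine (which is what PowerShell uses) and the sample string from above. The variable names are just for illustration. The greedy .* keeps going to the end of the line, so the whole input comes back as a single match instead of one match per tag.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input taken from the question.
string input = "|T1| This is some text for the first tag |T2| this is some text for the second tag";

// The greedy .* runs to the end of the line, so the first match
// swallows the |T2| tag and its text as well.
MatchCollection greedy = Regex.Matches(input, @"\|\w+\|.*", RegexOptions.ExplicitCapture);

Console.WriteLine(greedy.Count);             // 1
Console.WriteLine(greedy[0].Value == input); // True
```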

\|(?<tag>\w+)\|(?<text>[^|]*)

Matches |T1| This is some text for the first tag |T2| this is some text for the second tag

into

 |T1| This is some text for the first tag 
 |T2| this is some text for the second tag

EDIT: Use the Regex Groups collection to get the parts of each match:

var tagName = match.Groups["tag"].Value;
var text = match.Groups["text"].Value;

Switched to named groups instead of numbered ones.
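Putting the pieces together, something like the following should work as a complete sketch. It assumes C# with top-level statements and the sample string from the question. The pairs list and the Trim() call are additions for illustration, not part of the snippet above; without Trim() the captured text keeps its leading and trailing spaces.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string input = "|T1| This is some text for the first tag |T2| this is some text for the second tag";

// tag:  word characters between two pipes
// text: everything up to the next pipe (or the end of the input)
var pattern = new Regex(@"\|(?<tag>\w+)\|(?<text>[^|]*)");

// Hypothetical container for the results, used here only for illustration.
var pairs = new List<(string Tag, string Text)>();

foreach (Match match in pattern.Matches(input))
{
    var tagName = match.Groups["tag"].Value;
    var text = match.Groups["text"].Value.Trim(); // drop the padding around the text

    pairs.Add((tagName, text));
    Console.WriteLine($"{tagName} => {text}");
}
// T1 => This is some text for the first tag
// T2 => this is some text for the second tag
```

One caveat: because the text group is [^|]*, it stops at the first pipe it sees, so this only holds up if the text itself never contains a literal | character.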

