Regex to select text based on angle brackets, including nested angle brackets

Question

I want to select text according to the scenarios below. I have tried a couple of regexes, but I have not been able to cover all the scenarios with a single regex.

Set 1

Input: <x> <y>
Expected groups: <x> and <y>

Input: <Name> <NewName>
Expected groups: <Name> and <NewName>

Set 2

Input: sampletext <!PARSE<sampletext>><.value>
Expected groups: sampletext and <!PARSE<sampletext>><.value>

Input: found <!PARSE<XYZ.ID>notfound>
Expected groups: <found> and <!PARSE<XYZ.ID>notfound>

Input: <XYZ.IDXX> notfound
Expected groups: <XYZ.IDXX> and notfound

Input: notFoundString <!PARSE<XYZ.IDXX>notfound>
Expected groups: <notFoundString> and <!PARSE<XYZ.IDXX>notfound>

Input: notFoundEmpty <!PARSE<XYZ.IDXX>>
Expected groups: <notFoundEmpty> and <!PARSE<XYZ.IDXX>>

Set 3

Input: <thread.end> <thread.start>
Expected groups: <thread.end> and <thread.start>

Input: <!MINUS <thread.end> <thread.start>> 1000
Expected groups: <!MINUS <thread.end> <thread.start>> and 1000

Input: thread.duration <!DIVISION <!MINUS <thread.end> <thread.start>> 1000>
Expected groups: thread.duration and <!DIVISION <!MINUS <thread.end> <thread.start>> 1000>

Set 4

Input: 1234 5678
Expected groups: 1234 and 5678

Input: add.sample.result <!ADD 1234 5678>
Expected groups: add.sample.result and <!ADD 1234 5678>

Regexes I tried

<([^>]*)>|(\S+) This works for Sets 1 and 4, but for Sets 2 and 3 it captures more groups than required. https://regexr.com/3si0v
<(.*)>|(\S+) This works for Sets 2 and 4, but gives wrong results for Sets 1 and 3. https://regexr.com/3si12

I need a regex that gives the expected results listed above for all sets.
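
The behaviour of the two attempted patterns can be reproduced outside regexr. The following is a minimal Python sketch, not part of the original question; the names attempt_1, attempt_2 and matched_texts are illustrative only. It shows that the first pattern stops at the first > and so splits nested input, while the second is greedy and swallows two separate bracketed tokens as one.

```python
import re

# The two patterns from the question (names are illustrative).
attempt_1 = re.compile(r"<([^>]*)>|(\S+)")
attempt_2 = re.compile(r"<(.*)>|(\S+)")


def matched_texts(pattern, line):
    """Return the full text of every match of `pattern` in `line`."""
    return [m.group(0) for m in pattern.finditer(line)]


# Set 1: the first attempt behaves as expected.
print(matched_texts(attempt_1, "<x> <y>"))
# ['<x>', '<y>']

# Set 2: [^>]* stops at the first '>', so the nested token is split in two.
print(matched_texts(attempt_1, "sampletext <!PARSE<sampletext>><.value>"))
# ['sampletext', '<!PARSE<sampletext>', '><.value>']

# Set 1: the greedy .* runs to the last '>', merging both tokens into one.
print(matched_texts(attempt_2, "<x> <y>"))
# ['<x> <y>']
```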

Answer 1

You may use

((?:<[^<>]*(?:<[^<>]*(?:<[^<>]*>[^<>]*)*>[^<>]*)*>)+)|(\S+)

See the regex demo.

The pattern has two alternatives, each captured into its own group: Group 1 captures (?:<[^<>]*(?:<[^<>]*(?:<[^<>]*>[^<>]*)*>[^<>]*)*>)+ and Group 2 captures \S+. Every match fills exactly one of the two groups.

Details

(?:<[^<>]*(?:<[^<>]*(?:<[^<>]*>[^<>]*)*>[^<>]*)*>)+ - matches 1 or more consecutive occurrences of:
- < - a <
- [^<>]* - 0+ chars other than < and >
- (?:<[^<>]*(?:<[^<>]*>[^<>]*)*>[^<>]*)* - 0+ sequences of:
  - <[^<>]*(?:<[^<>]*>[^<>]*)*> - Nested level 1:
  - <[^<>]* - a < and 0+ chars other than < and >
  - (?:<[^<>]*>[^<>]*)* - Nested level 2: 0+ sequences of:
    - < - a <
    - [^<>]* - 0+ chars other than < and >
    - > - a >
    - [^<>]* - 0+ chars other than < and >
  - > - a > char
  - [^<>]* - 0+ chars other than < and >
- > - a >
| - or
\S+ - 1+ non-whitespace chars.
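
To check the pattern against the sets in the question, it can be run over the sample inputs. This is a minimal Python sketch; the names TOKEN and split_tokens are illustrative and not part of the original answer, and the pattern itself is copied unchanged from above.

```python
import re

# Pattern from the answer: group 1 = bracketed token, group 2 = plain token.
TOKEN = re.compile(
    r"((?:<[^<>]*(?:<[^<>]*(?:<[^<>]*>[^<>]*)*>[^<>]*)*>)+)|(\S+)"
)


def split_tokens(line):
    """Return every token matched in `line`, in order of appearance."""
    return [m.group(0) for m in TOKEN.finditer(line)]


samples = [
    "<x> <y>",
    "sampletext <!PARSE<sampletext>><.value>",
    "<!MINUS <thread.end> <thread.start>> 1000",
    "thread.duration <!DIVISION <!MINUS <thread.end> <thread.start>> 1000>",
    "add.sample.result <!ADD 1234 5678>",
    "1234 5678",
]

for line in samples:
    print(split_tokens(line))
```

Each sample line yields exactly two tokens. Note that the pattern spells out a fixed number of levels (the outer brackets plus two nested levels), so input nested more deeply than the examples in the question would need another level written into the pattern.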

Solution 1: 2, accepted, 2018-07-17 10:07:33