
Regex match being greedy after using non-greedy operator

I have the following text:

<def id="1">[<note>AA2</note>] Valer:<ex>asd</ex></def>
<def id="2">AWEs: [<note>DDD1</note>]:<ex>rfwc sdad</ex>[<note>CC#2</note>]:<ex>saq www</ex>[<note>POL1</note>]:<ex>Sasd.</ex></def>
<def id="3">Esd: [<note>AAA</note>]:<ex>qw wq.</ex>[<note>PS0</note>]:<ex>sad sadad.</ex></def>
<def id="4" type="L99">[<note>CARSF1</note>] asddds:<ex>ass www.</ex></def>

I'm trying to match when there's a [ immediately after the def tag is opened.

I have this pattern:

<def\s.*?>\[<note>(.*?)<\/note>\](.*?):<ex>(.*?)<\/ex><\/def>

But it matches all lines and I'm not really sure why.

Here's the demo

Your first .* should be [^>]*

Non-greedy means "consume as little as possible to make a successful match". If a successful match requires consuming additional characters, the non-greedy quantifier consumes as many characters as required, stopping as soon as a match becomes possible.
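To make that concrete, here is a small sketch in Python (the toy string and variable names are made up for illustration, they are not from the question). The same lazy .*? stops at the first > when nothing else is required, but expands past it as soon as the rest of the pattern cannot match there:

```python
import re

# Toy input, not taken from the question.
text = "<a><b>x"

# Nothing is required after '>', so the lazy quantifier stops at the first '>'.
short = re.search(r"<.*?>", text).group(0)

# Here an 'x' must follow '>'. The first '>' is followed by '<', so the
# lazy quantifier keeps expanding until a '>' followed by 'x' is found.
forced = re.search(r"<.*?>x", text).group(0)

print(short)   # <a>
print(forced)  # <a><b>x
```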

In your case the non-greedy .*? in the <def\s...> part keeps consuming characters past the first closing bracket > , because stopping there would produce no match: the next character is not [ . On lines two and three it expands all the way to the > just before the second [<note> , at which point the rest of the pattern matches the rest of the string.
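You can see this by checking what the first capture group holds on each line. The snippet below is only a sketch using Python's re module (the question does not say which regex engine is used, so that is an assumption), with the pattern and sample text copied from the question:

```python
import re

lines = [
    '<def id="1">[<note>AA2</note>] Valer:<ex>asd</ex></def>',
    '<def id="2">AWEs: [<note>DDD1</note>]:<ex>rfwc sdad</ex>[<note>CC#2</note>]:<ex>saq www</ex>[<note>POL1</note>]:<ex>Sasd.</ex></def>',
    '<def id="3">Esd: [<note>AAA</note>]:<ex>qw wq.</ex>[<note>PS0</note>]:<ex>sad sadad.</ex></def>',
    '<def id="4" type="L99">[<note>CARSF1</note>] asddds:<ex>ass www.</ex></def>',
]

# Original pattern from the question, lazy quantifiers everywhere.
lazy = re.compile(r'<def\s.*?>\[<note>(.*?)<\/note>\](.*?):<ex>(.*?)<\/ex><\/def>')

lazy_notes = []
for line in lines:
    m = lazy.search(line)
    # Record the first captured note, or None when the line does not match.
    lazy_notes.append(m.group(1) if m else None)

print(lazy_notes)  # ['AA2', 'CC#2', 'PS0', 'CARSF1']
```

On lines two and three the captured note is the second one in the line, not the first, which shows that the leading .*? has swallowed everything up to the second [<note> .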

Here is how you can fix this problem:

<def\s[^>]*>\[<note>([^<]*)<\/note>\]([^<]*):<ex>([^<]*)<\/ex><\/def>

Demo.

The idea is to replace all non-greedy expressions with greedy negated character classes that have an explicit stop character (i.e. < or > , depending on the context), so that no part of the pattern can run past a tag boundary.
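As a quick check, here is the fixed pattern run against the four sample lines. Again this is a sketch that assumes Python's re module, and the variable names are illustrative:

```python
import re

lines = [
    '<def id="1">[<note>AA2</note>] Valer:<ex>asd</ex></def>',
    '<def id="2">AWEs: [<note>DDD1</note>]:<ex>rfwc sdad</ex>[<note>CC#2</note>]:<ex>saq www</ex>[<note>POL1</note>]:<ex>Sasd.</ex></def>',
    '<def id="3">Esd: [<note>AAA</note>]:<ex>qw wq.</ex>[<note>PS0</note>]:<ex>sad sadad.</ex></def>',
    '<def id="4" type="L99">[<note>CARSF1</note>] asddds:<ex>ass www.</ex></def>',
]

# Fixed pattern: each negated class stops at the next '<' or '>'.
strict = re.compile(r'<def\s[^>]*>\[<note>([^<]*)<\/note>\]([^<]*):<ex>([^<]*)<\/ex><\/def>')

strict_groups = []
for line in lines:
    m = strict.search(line)
    # Keep all three captures for matching lines, None otherwise.
    strict_groups.append(m.groups() if m else None)

for groups in strict_groups:
    print(groups)
# ('AA2', ' Valer', 'asd')
# None
# None
# ('CARSF1', ' asddds', 'ass www.')
```

Only lines one and four match now, because [^>]* cannot cross the > that closes the def tag, so the [ has to come immediately after it.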
