Tcl regexp: Why does '+' not match as many as possible?

I am using Tcl 8.4. In the following expression, I tried to capture the numeric value with ([0-9]+), but it does not match as many digits as possible, even though the documentation says '+' matches as many as possible (ref: http://wiki.tcl.tk/396). Also, please suggest any better way of doing what I want to do.

% set a {
NOTPLD STATS:
              Bps:                    0; pps:                    0; Bytes:                    0; Packets:                    4535

TPLD STATS:
          Bps:                    0; pps:                    0; Bytes:                    0; Packets:                    4535

}
%
% regexp {NOTPLD STATS:(.*?)Packets:[\s]+([0-9]+)} $a t1 t2 c 
1
% set c
4

See "Interaction Between Quantifiers with Different Greediness":

All quantifiers in a branch get switched to the same greediness, so adding a non-greedy quantifier makes the other quantifiers in the branch implicitly non-greedy as well.

Thus, your ([0-9]+) is treated as ([0-9]+?): it matches one or more digits, but as few as possible to still return a valid match. Since nothing follows it in the pattern, a single digit is enough, which is why c is 4. In general, a lazy subpattern at the very end of a pattern matches zero characters (*?) or one character (+?).
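
The effect can be reproduced in isolation. The following is a minimal sketch, assuming a made-up input string (not the data from the question): the same capture group returns all the digits when every quantifier is greedy, and only one digit once a non-greedy quantifier appears earlier in the pattern.

```tcl
# Hypothetical sample input, chosen only for illustration.
set s "Packets:    4535"

# Every quantifier is greedy, so the capture takes all the digits.
regexp {Packets:\s+([0-9]+)} $s -> greedy

# The leading (.*?) makes the whole pattern prefer the shortest match,
# so the capture stops after a single digit.
regexp {(.*?)Packets:\s+([0-9]+)} $s -> skipped lazy

puts "greedy=$greedy lazy=$lazy"
# prints: greedy=4535 lazy=4
```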

A simple solution is to require one more character after the digits, so the lazy match has to run through all of them to reach it. Here that character is a newline (any whitespace will do):

regexp {NOTPLD STATS:(.*?)Packets:[\s]+([0-9]+)\s} $a t1 t2 c
                                                ^

See IDEONE demo

If the value can be at the end of the string, replace the trailing \s with an alternation: (?:\s|$). Inside braces, write it with a single backslash.
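
As a rough sketch of that case, assuming a hypothetical input in which the number is the last thing in the string, the alternation still forces the capture to take every digit. The second command is one possible alternative for the "better way" part of the question, not the answer's method: it avoids non-greedy quantifiers altogether by skipping to the next line with [^\n]*, so the greediness switch never comes into play.

```tcl
# Hypothetical input: the value sits at the very end, with no trailing newline.
set s "NOTPLD STATS:\n   Bytes:   0; Packets:     4535"

# Require whitespace or end of string after the digits.
regexp {NOTPLD STATS:(.*?)Packets:\s+([0-9]+)(?:\s|$)} $s -> skipped count

# Alternative sketch: only greedy quantifiers, bounded by line breaks.
regexp {NOTPLD STATS:[^\n]*\n[^\n]*Packets:\s+([0-9]+)} $s -> count2

puts "count=$count count2=$count2"
# prints: count=4535 count2=4535
```

The second pattern assumes the Packets field is on the line directly after the NOTPLD STATS: header, as in the question's data.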
