
Regular expression for matching a character except the first occurrence

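How can I match every occurrence of a character in a text except the first one?

In 98C546CC456C67, every C after the first should match: the two adjacent Cs following 546 and the C before 67, but not the C right after 98.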

This problem is a classic case of the technique explained in this question to "regex-match a pattern, excluding..."

We can solve it with a beautifully simple regex:

^[^C]*C|(C)

The left side of the alternation | matches from the beginning of the string up to and including the first C. We will ignore this match. The right side matches any other C and captures it to Group 1. We know these are the right ones because the first C was already consumed by the expression on the left.

This program shows how to use the regex (see the results at the bottom of the online demo):

var subject = '98C546CC456C67';
var regex = /^[^C]*C|(C)/g;
var group1Caps = [];
var match = regex.exec(subject);

// Put Group 1 captures in an array; a match from the left branch leaves Group 1 unset
while (match != null) {
    if (match[1] != null) group1Caps.push(match[1]);
    match = regex.exec(subject);
}

document.write("<br>*** Matches ***<br>");
// Iterate by index: the original for...in leaked an undeclared global `key`
for (var i = 0; i < group1Caps.length; i++) {
    document.write(group1Caps[i], "<br>");
}
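The same alternation also works with a replace callback, which may be more useful when the goal is to change the later occurrences rather than list them. The following is a minimal sketch, not part of the original answer; the helper name replaceAllButFirstC and the '-' replacement are made up for illustration.

```javascript
// Hypothetical helper: replace every "C" except the first one.
function replaceAllButFirstC(subject, replacement) {
    // Left branch consumes everything up to and including the first C;
    // right branch captures each later C in Group 1.
    var regex = /^[^C]*C|(C)/g;
    return subject.replace(regex, function (whole, group1) {
        // Group 1 is undefined when the left branch matched: keep that text unchanged.
        return group1 === undefined ? whole : replacement;
    });
}

console.log(replaceAllButFirstC('98C546CC456C67', '-'));
```

Because the left branch is anchored with ^ and the regex has no m flag, it can only match once, so every later C falls through to the capturing branch.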

Reference

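Unfortunately, JavaScript regex engines without lookbehind support (anything before ES2018) are severely limited here: they cannot match only the later occurrences with a single regex. On such engines, the best solution would probably be to do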

txt = subject.match(/[A-Z]/ig);  // or /[A-Z]+/ig, if CC should be a single match

and discard the first match.
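Spelled out, the discard step could look like the sketch below. The function name lettersExceptFirst is an assumption for illustration; the regex is the one from the snippet above, so it collects any letter, not only C.

```javascript
// Hypothetical helper: collect all letters, then drop the first one found.
function lettersExceptFirst(subject) {
    // With the g flag, match() returns null when nothing matches, so fall back to [].
    var letters = subject.match(/[A-Z]/ig) || [];
    // slice(1) discards the first match and keeps the rest.
    return letters.slice(1);
}

console.log(lettersExceptFirst('98C546CC456C67'));
```

Note that this approach returns only the matched text, not the positions; if positions are needed, a loop over regex.exec gives match.index for each hit.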

In a flavor that supports quantifiers inside lookbehind, such as .NET Regex, you can use a lookbehind to require that an earlier C precedes the one being matched:

foreach(Match m in Regex.Matches("98C546CC456C67", @"(?<=C.*?)C")){
    Console.WriteLine(m.ToString() + " at position " + m.Index);
}
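JavaScript engines that implement ES2018 or later also accept lookbehind with quantifiers, so the same pattern can be tried there. This is a sketch under that assumption (it throws a SyntaxError on older engines), and the function name findAllButFirstC is made up for illustration.

```javascript
// Hypothetical helper: find every "C" that has an earlier "C" somewhere before it.
// Assumes an engine with ES2018 lookbehind support.
function findAllButFirstC(subject) {
    var regex = /(?<=C.*?)C/g;
    var hits = [];
    var match;
    while ((match = regex.exec(subject)) !== null) {
        // Record the matched text and where it starts in the subject.
        hits.push({ text: match[0], index: match.index });
    }
    return hits;
}

findAllButFirstC('98C546CC456C67').forEach(function (hit) {
    console.log(hit.text + ' at position ' + hit.index);
});
```

Since . does not match line terminators by default, this version only looks for an earlier C on the same line; adding the s flag would lift that restriction.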
