
Regex/LINQ to replace consecutive characters with a count

I have the following method (written in C#/.NET). The input text consists only of letters (no digits). The return value is another string in which every run of more than two consecutive identical characters is replaced with a single copy of that character preceded by the repetition count. Example: aAAbbbcccc -> aAA3b4c

public static string Pack(string text)
{
    if (string.IsNullOrEmpty(text)) return text;

    StringBuilder sb = new StringBuilder(text.Length);

    char prevChar = text[0];
    int prevCharCount = 1;

    for (int i = 1; i < text.Length; i++)
    {
        char c = text[i];
        if (c == prevChar) prevCharCount++;
        else
        {
            if (prevCharCount > 2) sb.Append(prevCharCount);
            else if (prevCharCount == 2) sb.Append(prevChar);
            sb.Append(prevChar);

            prevChar = c;
            prevCharCount = 1;
        }
    }

    if (prevCharCount > 2) sb.Append(prevCharCount);
    else if (prevCharCount == 2) sb.Append(prevChar);
    sb.Append(prevChar);

    return sb.ToString();
}

The method is not too long, but does anyone have an idea how to do this more concisely using regex or LINQ?

How about:

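// Requires: using System; using System.Text.RegularExpressions;
static readonly Regex re = new Regex(@"(\w)(\1){2,}", RegexOptions.Compiled);
static void Main() {
    // Replace each run of three or more identical characters with "<count><char>".
    string result = re.Replace("aAAbbbcccc",
         match => match.Length.ToString() + match.Value[0]);
    Console.WriteLine(result); // aAA3b4c
}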

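The regex matches a word character followed by the same character (via a back-reference) at least twice, i.e. a run of three or more. The lambda takes the length of the match (match.Length) and appends the first character (match.Value[0]). Note that \w also matches digits and the underscore; that is harmless here because the input contains only letters.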
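
Since the question also asks about LINQ, here is a minimal sketch that wraps the regex approach in a method and adds a LINQ-flavoured variant. The names PackSketch, PackRegex, PackLinq, LongRun and AnyRun are made up for illustration and are not part of the original answer. The LINQ variant still uses a regex to split the text into runs, because LINQ has no built-in operator for grouping consecutive equal elements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PackSketch
{
    // Same idea as the answer above: one word character followed by
    // at least two repeats of it (a run of three or more).
    private static readonly Regex LongRun =
        new Regex(@"(\w)\1{2,}", RegexOptions.Compiled);

    // Matches every run, including runs of length one: any character
    // followed by zero or more repeats of itself.
    private static readonly Regex AnyRun =
        new Regex(@"(.)\1*", RegexOptions.Compiled | RegexOptions.Singleline);

    // Regex-only version: rewrite long runs in place, leave the rest alone.
    public static string PackRegex(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;
        return LongRun.Replace(text, m => m.Length.ToString() + m.Value[0]);
    }

    // LINQ-flavoured version: split into runs, map each run, concatenate.
    public static string PackLinq(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;
        return string.Concat(
            AnyRun.Matches(text)
                  .Cast<Match>()
                  .Select(m => m.Length > 2
                      ? m.Length.ToString() + m.Value[0]  // e.g. "bbb" becomes "3b"
                      : m.Value));                        // runs of 1 or 2 stay as-is
    }

    public static void Main()
    {
        Console.WriteLine(PackRegex("aAAbbbcccc")); // aAA3b4c
        Console.WriteLine(PackLinq("aAAbbbcccc"));  // aAA3b4c
    }
}
```

Both variants compare characters case-sensitively, so "aAA" is treated as a single "a" followed by a run of two "A"s and is left unchanged, which matches the example in the question.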
