regex/linq to replace consecutive characters with count

I have the following method (written in C#/.NET). The input text consists only of letters (no digits). The returned value is another text in which each group of more than two consecutive identical characters is replaced with a single copy of that character preceded by the repetition count. Ex.: aAAbbbcccc -> aAA3b4c

// Requires: using System.Text;
public static string Pack(string text)
{
    if (string.IsNullOrEmpty(text)) return text;

    StringBuilder sb = new StringBuilder(text.Length);

    // Track the current run: its character and its length so far.
    char prevChar = text[0];
    int prevCharCount = 1;

    for (int i = 1; i < text.Length; i++)
    {
        char c = text[i];
        if (c == prevChar) prevCharCount++;
        else
        {
            // The run ended: emit "<count><char>" for runs of 3 or more,
            // otherwise emit the one or two characters unchanged.
            if (prevCharCount > 2) sb.Append(prevCharCount);
            else if (prevCharCount == 2) sb.Append(prevChar);
            sb.Append(prevChar);

            prevChar = c;
            prevCharCount = 1;
        }
    }

    // Flush the final run, which the loop never emits.
    if (prevCharCount > 2) sb.Append(prevCharCount);
    else if (prevCharCount == 2) sb.Append(prevChar);
    sb.Append(prevChar);

    return sb.ToString();
}

The method is not too long. But does anyone have an idea how to do this more concisely using regex? Or LINQ?

How about:

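// Requires: using System.Text.RegularExpressions;
static readonly Regex re = new Regex(@"(\w)(\1){2,}", RegexOptions.Compiled);
static void Main() {
    // Each run of three or more identical characters becomes "<count><char>".
    string result = re.Replace("aAAbbbcccc",
         match => match.Length.ToString() + match.Value[0]);
    // result is "aAA3b4c"
}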

The regex is a word character followed by the same character (a back-reference) at least twice; the lambda takes the length of the match ( match.Length ) and appends the first character ( match.Value[0] ).
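To compare this directly with the original Pack method, the same idea can be wrapped in a method with the same signature. This is only a sketch: the class name RegexPacker and the field name Run are made up for illustration, and the pattern is written as (\w)\1{2,}, which should match the same text as the (\w)(\1){2,} above without the extra capture group. Note that \w also matches digits and underscores, which is harmless only because the input is stated to contain letters only, and that matching is case-sensitive, so aAA is left untouched.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class; the name is an assumption, not part of the original answer.
public static class RegexPacker
{
    // One word character followed by the same character at least two more times,
    // i.e. a run of three or more identical characters.
    private static readonly Regex Run =
        new Regex(@"(\w)\1{2,}", RegexOptions.Compiled);

    public static string Pack(string text)
    {
        // Mirror the null/empty handling of the original method.
        if (string.IsNullOrEmpty(text)) return text;

        // m.Length is the run length and m.Value[0] is the repeated character.
        return Run.Replace(text, m => m.Length.ToString() + m.Value[0]);
    }
}
```

Calling RegexPacker.Pack("aAAbbbcccc") should give aAA3b4c, the same as the loop-based version; runs of one or two characters never match the pattern, so they pass through unchanged.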
