Regex in C#: how to replace only capture groups and not non-capture groups

I am writing a regular expression in Visual Studio 2013 Express using C#. I am trying to put single quotes around every string made up of word characters and !@#$%^&*()_- except for:

  • and
  • or
  • not
  • empty()
  • notempty()
  • currentdate()
  • any string that already has single quotes around it.

Here is my regex and a sample of what it does: https://regex101.com/r/nI1qP0/1

I want to put single quotes only around the capture groups and leave the non-capture groups untouched. I know this can be done with lookarounds, but I don't know how.

You can use this regex:

(?:'[^']*'|(?:\b(?:(?:not)?empty|currentdate)\(\)|and|or|not))|([!@#$%^&*_.\w-]+)

Here, the ignored matches are not captured, and the words to be quoted can be retrieved using Match.Groups[1]. You can then add quotes around Match.Groups[1] and get the whole input replaced as you want.

RegEx Demo
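
To see which matches actually land in group 1, the matches can be enumerated and checked one by one. This is only a minimal sketch: the class name GroupInspection and the helper TokensToQuote are illustrative names, not part of the answer above, and the sample input is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupInspection
{
    // Same pattern as above: the alternatives to skip come first and are not
    // captured; any other token falls through to capture group 1.
    private static readonly Regex Pattern = new Regex(
        @"(?:'[^']*'|(?:\b(?:(?:not)?empty|currentdate)\(\)|and|or|not))|([!@#$%^&*_.\w-]+)");

    // Hypothetical helper: returns only the tokens that would receive quotes.
    public static List<string> TokensToQuote(string input)
    {
        var result = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            // Group 1 participates only when none of the skipped alternatives matched.
            if (m.Groups[1].Success)
                result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        var tokens = TokensToQuote("'already quoted' foo and notempty() bar-1");
        Console.WriteLine(string.Join(", ", tokens)); // prints: foo, bar-1
    }
}
```

The quoted string, the keyword and the function call are all matched by the first alternative, so they are consumed without setting group 1 and are left out of the list.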

You need to use a match evaluator, or a callback method. The point is that you can examine the match and captured groups inside this method, and decide what action to take depending on your pattern.

So, add this callback method (may be non-static if the calling method is non-static):

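public static string repl(Match m)
{
    // Group 1 participates only when the match is a token that needs quoting;
    // in that case it spans the whole match, so wrap it in single quotes.
    return m.Groups[1].Success
        ? "'" + m.Groups[1].Value + "'"
        : m.Value; // skipped alternatives are returned unchanged
}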

Then, use an overload of Regex.Replace with the match evaluator (= callback method):

var s = "'This is not captured' but this is and not or empty() notempty() currentdate() capture";
var rx = new Regex(@"(?:'[^']*'|(?:\b(?:(?:not)?empty|currentdate)\(\)|and|or|not))|([!@#$%^&*_.\w-]+)");
Console.WriteLine(rx.Replace(s, repl));

Note you can shorten the code with a lambda expression:

Console.WriteLine(rx.Replace(s, m => m.Groups[1].Success ?
    "'" + m.Groups[1].Value + "'" :
    m.Value));

See IDEONE demo
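
One case worth checking (an observation about the pattern, not something stated in the answer): the alternatives and|or|not carry no word boundary, so a token that merely starts with one of them, such as order, is split into a skipped "or" and a quoted "der". Below is a minimal sketch of a possible refinement, assuming keywords should be skipped only when they stand as whole words. The class name KeywordBoundaryDemo, the helper Quote and the Bounded pattern are illustrative and not from the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeywordBoundaryDemo
{
    // Pattern from the answer above, unchanged.
    public static readonly Regex Original = new Regex(
        @"(?:'[^']*'|(?:\b(?:(?:not)?empty|currentdate)\(\)|and|or|not))|([!@#$%^&*_.\w-]+)");

    // Assumed variant: the keywords are wrapped in \b...\b so that they
    // match only as whole words.
    public static readonly Regex Bounded = new Regex(
        @"(?:'[^']*'|\b(?:(?:not)?empty|currentdate)\(\)|\b(?:and|or|not)\b)|([!@#$%^&*_.\w-]+)");

    // Quotes every match that set group 1 and leaves all other matches as they are.
    public static string Quote(Regex rx, string input)
    {
        return rx.Replace(input, m => m.Groups[1].Success ? "'" + m.Value + "'" : m.Value);
    }

    public static void Main()
    {
        Console.WriteLine(Quote(Original, "order and android")); // prints: or'der' and and'roid'
        Console.WriteLine(Quote(Bounded, "order and android"));  // prints: 'order' and 'android'
    }
}
```

Because \b treats characters such as - as non-word characters, a token like and-x is still split by the bounded variant. If that matters, a lookahead on the full token character class would be needed instead.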

Instead of trying to ignore the strings with words and !@#$%^&*()_- in them, I just included them in my search, placed an extra single quote on either end, and then removed all instances of two single quotes, like so:

 // Find any string of words and !@#$%^&*()_- in and out of quotes.
 Regex getwords = new Regex(@"(^(?!and\b)(?!or\b)(?!not\b)(?!empty\b)(?!notempty\b)(?!currentdate\b)([\w!@#$%^&*())_-]+)|((?!and\b)(?!or\b)(?!not\b)(?!empty\b)(?!notempty\b)(?!currentdate\b)(?<=\W)([\w!@#$%^&*()_-]+)|('[\w\s!@#$%^&*()_-]+')))", RegexOptions.IgnoreCase);
 // Find all cases of two single quotes
 Regex getQuotes = new Regex(@"('')");

 // Get string from user
 Console.WriteLine("Type in a string");
 string search = Console.ReadLine();

 // Execute Expressions.
 search = getwords.Replace(search, "'$1'");
 search = getQuotes.Replace(search, "'");
