Regex replace multiple groups

I would like to use regular expressions to replace several different substrings, each with its own replacement string.

Replacement table:

  • "&" -> "__amp"
  • "#" -> "__hsh"
  • "1" -> "5"
  • "5" -> "6"

For example, for the following input string

"a1asda&fj#ahdk5adfls"

the corresponding output string is

"a5asda__ampfj__hshahdk6adfls"

Is there any way to do that?

Given a dictionary that defines your replacements:

IDictionary<string, string> map = new Dictionary<string, string>()
{
    {"&","__amp"},
    {"#","__hsh"},
    {"1","5"},
    {"5","6"},
};

You can use this both for constructing a Regular Expression, and to form a replacement for each match:

var str = "a1asda&fj#ahdk5adfls";
var regex = new Regex(String.Join("|",map.Keys));
var newStr = regex.Replace(str, m => map[m.Value]);
// newStr = a5asda__ampfj__hshahdk6adfls

Live example: http://rextester.com/rundotnet?code=ADDN57626

This uses the Regex.Replace overload that takes a MatchEvaluator delegate, so a lambda expression can compute the replacement for each match.


As pointed out in the comments, a key that contains regex syntax (for example "." or "$") will not work as expected, because it is interpreted as a pattern rather than as literal text. This can be overcome by passing each key through Regex.Escape, a minor change to the code above:

var str = "a1asda&fj#ahdk5adfls";
var regex = new Regex(String.Join("|",map.Keys.Select(k => Regex.Escape(k))));
var newStr = regex.Replace(str, m => map[m.Value]);
// newStr = a5asda__ampfj__hshahdk6adfls
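
As a rough sketch, the escaped version can be wrapped in a small helper. The names MultiReplace and ReplaceAll are made up for illustration. Sorting the keys longest-first is an extra assumption of this sketch, not part of the answer above: it keeps a longer key from being shadowed by a shorter key that it starts with.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class MultiReplace
{
    // Replaces every literal key of `map` found in `input` with its mapped value,
    // in a single pass over the input.
    public static string ReplaceAll(string input, IDictionary<string, string> map)
    {
        // An empty pattern would match everywhere, so bail out early.
        if (map.Count == 0) return input;

        // Escape each key so characters such as '.' or '$' match literally.
        // Assumption: longer keys go first so "ab" wins over "a" when both are keys.
        var pattern = string.Join("|",
            map.Keys.OrderByDescending(k => k.Length).Select(k => Regex.Escape(k)));

        // The keys were escaped, so the matched text is always one of the keys.
        return Regex.Replace(input, pattern, m => map[m.Value]);
    }
}
```

Calling MultiReplace.ReplaceAll(str, map) with the dictionary from this answer should behave like the snippet above, while also accepting keys such as ".".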

How about using string.Replace()?

string foo = "a1asda&fj#ahdk5adfls"; 

string bar = foo.Replace("&","__amp")
                .Replace("#","__hsh")
                .Replace("5", "6")
                .Replace("1", "5");
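
Chained calls like these run one after another, so a later call also sees the output of earlier ones. The sketch below illustrates why "5" is replaced before "1" above. The class and method names are invented for the illustration.

```csharp
using System;

// Hypothetical demo, not part of the original answer.
public static class ChainedReplaceDemo
{
    // "5" -> "6" runs before "1" -> "5", so freshly produced 5s are left alone.
    public static string SafeOrder(string s) =>
        s.Replace("5", "6").Replace("1", "5");

    // "1" -> "5" runs first, so those new 5s are then turned into 6s as well.
    public static string UnsafeOrder(string s) =>
        s.Replace("1", "5").Replace("5", "6");
}
```

For the input "a1b5", SafeOrder gives "a5b6" while UnsafeOrder gives "a6b6". A single-pass regex replacement avoids this kind of ordering dependency.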

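Given a dictionary like in the other answers, you can use an "aggregate" to apply each pattern in the dictionary with its replacement, one after another. This gives you far more flexibility than the other answers, as each key can be a full regex pattern and you can use different regex options for each one. Note that the order in which the patterns are applied matters, and Dictionary<TKey, TValue> does not guarantee enumeration order, so use a list of pairs if you need to rely on it.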

For example, the following code will "romanize" Greek text ( https://en.wikipedia.org/w/index.php?title=Romanization_of_Greek&section=3#Modern_Greek , Standard/UN):

var map = new Dictionary<string,string>() {
    {"α[ύυ](?=[άαβγδέεζήηίΐϊιλμνόορύΰϋυώω])", "av"}, {"α[ύυ]", "af"}, {"α[ϊΐ]", "aï"}, {"α[ιί]", "ai"}, {"[άα]", "a"},
    {"β", "v"}, {"γ(?=[γξχ])", "n"}, {"γ", "g"}, {"δ", "d"},
    {"ε[υύ](?=[άαβγδέεζήηίΐϊιλμνόορύΰϋυώω])", "ev"}, {"ε[υύ]", "ef"}, {"ει", "ei"}, {"[εέ]", "e"}, {"ζ", "z"},
    {"η[υύ](?=[άαβγδέεζήηίΐϊιλμνόορύΰϋυώω])", "iv"}, {"η[υύ]", "if"}, {"[ηήιί]", "i"}, {"[ϊΐ]", "ï"},
    {"θ", "th"}, {"κ", "k"}, {"λ", "l"}, {"\\bμπ|μπ\\b", "b"}, {"μπ", "mb"}, {"μ", "m"}, {"ν", "n"},
    {"ο[ιί]", "oi"}, {"ο[υύ]", "ou"}, {"[οόωώ]", "o"}, {"ξ", "x"}, {"π", "p"}, {"ρ", "r"},
    {"[σς]", "s"}, {"τ", "t"}, {"[υύϋΰ]", "y"}, {"φ", "f"}, {"χ", "ch"}, {"ψ", "ps"}
};

var input = "Ο Καλύμνιος σφουγγαράς ψυθίρισε πως θα βουτήξει χωρίς να διστάζει."; 
var output = map.Aggregate(input, (i, m) => Regex.Replace(i, m.Key, m.Value, RegexOptions.IgnoreCase));

which yields (without modifying the "input" variable):

"o kalymnios sfoungaras psythirise pos tha voutixei choris na distazei."

You can of course use something like:

foreach (var m in map) input = Regex.Replace(input, m.Key, m.Value, RegexOptions.IgnoreCase);

which does modify the "input" variable.

You can also precompile the patterns to improve performance:

var remap = new Dictionary<Regex, string>();
foreach (var m in map) remap.Add(new Regex(m.Key, RegexOptions.IgnoreCase | RegexOptions.Compiled), m.Value);

Cache the remap dictionary (or make it static) and then use:

remap.Aggregate(input, (i, m) => m.Key.Replace(i, m.Value));
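
A minimal sketch of such a cached rule set follows. It assumes a List of pairs instead of a dictionary, so the rules run in exactly the order they are written. The OrderedRules class and the three sample rules are only illustrative, taken from a small subset of the map above.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class OrderedRules
{
    // A list (not a dictionary) so that rules are applied in exactly this order.
    // "μπ" must come before "μ" and "π", or it would never get a chance to match.
    private static readonly List<KeyValuePair<Regex, string>> Rules =
        new List<KeyValuePair<Regex, string>>
        {
            Rule("μπ", "mb"),
            Rule("μ", "m"),
            Rule("π", "p"),
        };

    // Builds one precompiled, case-insensitive rule.
    private static KeyValuePair<Regex, string> Rule(string pattern, string replacement) =>
        new KeyValuePair<Regex, string>(
            new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Compiled),
            replacement);

    // Applies every rule in turn; each rule sees the output of the previous one.
    public static string Apply(string input) =>
        Rules.Aggregate(input, (text, rule) => rule.Key.Replace(text, rule.Value));
}
```

The Regex objects are built once, when the class is first used, and every call to OrderedRules.Apply reuses them.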

This is similar to Jamiec's answer, but it lets you use regexes that don't match the text exactly. For example, the pattern \. can't be used with Jamiec's answer, because the matched text "." is not a key in the dictionary and the lookup fails.

This solution relies on wrapping each pattern in a group, checking which group matched, and then looking up the replacement value for that group. It's more complicated, but more flexible.

First make the map a list of KeyValuePairs:

var map = new List<KeyValuePair<string, string>>();           
map.Add(new KeyValuePair<string, string>(@"\.", "dot"));

Then create your regex like so:

string pattern = String.Join("|", map.Select(k => "(" + k.Key + ")"));
var regex = new Regex(pattern, RegexOptions.Compiled);

Then the match evaluator becomes a bit more complicated:

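private static string Evaluator(List<KeyValuePair<string, string>> map, Match match)
{
    // Groups[0] is the whole match and always succeeds, so start at group 1.
    // Group i corresponds to map[i - 1]; this assumes the patterns themselves
    // contain no capturing groups, which would shift the numbering.
    for (int i = 1; i < match.Groups.Count; i++)
    {
        if (match.Groups[i].Success)
        {
            return map[i - 1].Value;
        }
    }

    // Shouldn't happen
    throw new ArgumentException("Match found that doesn't have any successful groups");
}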

Then call the regex replace like so:

var newString = regex.Replace(text, m => Evaluator(map, m));
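
Putting the pieces together, here is one possible self-contained sketch of the group-based approach. It departs from the code above in one respect, which is an assumption of this sketch: each rule is wrapped in a named group ("r0", "r1", ...) instead of a numbered one, so a rule that contains its own capturing groups does not disturb the lookup. GroupReplace and ReplaceAll are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class GroupReplace
{
    public static string ReplaceAll(string text, IList<KeyValuePair<string, string>> rules)
    {
        // An empty pattern would match everywhere, so bail out early.
        if (rules.Count == 0) return text;

        // Wrap rule i in a named group "r{i}" so the evaluator can tell which rule matched.
        var pattern = string.Join("|",
            rules.Select((rule, i) => "(?<r" + i + ">" + rule.Key + ")"));
        var regex = new Regex(pattern);

        return regex.Replace(text, match =>
        {
            // Find the first rule whose group took part in this match.
            for (int i = 0; i < rules.Count; i++)
            {
                if (match.Groups["r" + i].Success)
                    return rules[i].Value;
            }
            throw new InvalidOperationException("No rule group matched.");
        });
    }
}
```

With the rules (@"\.", "dot") and (@"\d+", "num"), the input "a.b42" should come back as "adotbnum".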

Just wanted to share my experience with Jamiec's and Costas's solutions.

If you get an exception like this: The given key '<searched param>' was not present in the dictionary.

Bear in mind that putting regex patterns in the dictionary keys

IDictionary<string, string> map = new Dictionary<string, string>()
{
   {"(?<=KeyWord){","("},
   {"}",")"}
};

and using it like so

var regex = new Regex(String.Join("|",map.Keys));
var newStr = regex.Replace(str, m => map[m.Value]);

or so

var newStr = Regex.Replace(content, pattern, m => replacementMap[m.Value]);

may throw the aforementioned exception, because the dictionary is looked up with the matched text, not with the pattern that produced the match. A lookbehind is zero-width, so here the matched text is just "{", which is not one of the keys:

'(?<=KeyWord){' != '{'

So here is my solution:

I had to replace a "{" that followed a KeyWord and the corresponding "}" after that with "(" and ")" respectively.

In short, turning this

@"some random text KeyWord{""Value1"", ""Value2""} some more
    random text";

into this

@"some random text KeyWord('""Value1"", ""Value2""') some more
    random text";

Important bits

IDictionary<string, string> map = new Dictionary<string, string>()
{
    {"{","('"},
    {"}","')"}
};

// Inside a verbatim string (@"..."), a double quote is written as "".
var content = @"some random text KeyWord{""Value1"", ""Value2""} some more
    random text";
var pattern = "((?<=KeyWord){)|((?<=\")})";
var newStr = Regex.Replace(content, pattern, m => map[m.Value]);
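
For reference, the same idea as a compact, self-contained sketch. BraceRewrite is a made-up name, and the braces in the pattern are escaped here as a precaution, which is a choice of this sketch rather than something the code above requires.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class BraceRewrite
{
    // Keys are the literal matched text, not the patterns that find it.
    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>
        {
            { "{", "('" },
            { "}", "')" },
        };

    // The lookbehinds are zero-width, so m.Value is only ever "{" or "}".
    private static readonly Regex Pattern =
        new Regex("(?<=KeyWord)\\{|(?<=\")\\}");

    public static string Rewrite(string content) =>
        Pattern.Replace(content, m => Map[m.Value]);
}
```

Braces that do not follow KeyWord (for "{") or a double quote (for "}") are left untouched.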

Hope this jumble of words will be useful to someone
