Using Regex.Replace to replace matches with captures from two expressions

I'm looking for a way to use RegEx to capture groups from two separate expressions, and use them for a search and replace in a single string with the captures shared between the two replaces.

For example:

string input_a = "abc-def-ghi";
string input_b = "123-4567-89";

string pattern_a = "(?<first>def)";  // captures 'def' from input_a and 
                                     // names the capture as 'first'
string pattern_b = "(?<second>456)"; // captures '456' from input_b and
                                     // names the capture as 'second'

string translation_a = "A=${first}${second}"; // replacement strings use the named
string translation_b = "B=${second}${first}"; // captures from both RegExs

// I want the results of the replace to give:

Console.Write("Result A: abc-A=def456-ghi"); // result of regex search and replace
                                             // matches on 'def' and replaces this
                                             // with 'A=' followed by 'def' from the 
                                             // first expression and '456' from the
                                             // second expression

Console.Write("Result B: 123-B=456def7-89"); // same thing again but the other way
                                             // around

My inputs, patterns and translations are not known until runtime, as they are user configurable and stored in a database.

Can anyone suggest a neat elegant way to do this?

UPDATE

To give a little more context to my question, here is a real-life example. I'm using this in a telecoms system that processes incoming calls. As calls come in, they have two pieces of information: the dialled number (known as the DDI) and the calling number (known as the CLI).

The system I'm creating needs to route numbers in a very dynamic configurable way using a list of 'rules' stored in a database, which are in fact a set of regular expressions. The rules need to be updated via a GUI, so nothing can be hard coded.

This part of the system does a kind of pre-routing translation on the incoming calls. Some examples include (this is all fictitious data):

DDI              CLI
800123400        01373000001
4150800123401    01373000002
123402077000000  01373000003

I need the calls to 'come out the other side' with their DDI and CLI translated. My database holds: DDISearchPattern, DDITranslation, CLISearchPattern, CLITranslation.

My first simple rule is:

DDISearchPattern = "^0?(?<ddi>\d{9})$"
DDITranslation   = "0${ddi}"
CLISearchPattern = "^0?(?<cli>\d{9})$"
CLITranslation   = "0${cli}"

Sometimes calls hit the system missing the leading zero. This rule will add it back on.

The next rule strips off a 415 prefix.

DDISearchPattern = "^4150?(?<ddi>\d{9})$"
DDITranslation   = "0${ddi}"
CLISearchPattern = "^0?(?<cli>\d{9})$"
CLITranslation   = "0${cli}"
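
As a quick illustration of how these two rules behave, here is a minimal sketch that runs them through Regex.Replace against the first two sample DDIs from the table. The class name RuleDemo and the Apply helper are made up for this example and are not part of the real system.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RuleDemo
{
    // Applies one search pattern / translation pair. Regex.Replace returns
    // the input unchanged when the pattern does not match.
    public static string Apply(string input, string pattern, string translation)
    {
        return Regex.Replace(input, pattern, translation);
    }

    public static void Main()
    {
        // Rule 1: add back a missing leading zero.
        Console.WriteLine(Apply("800123400", @"^0?(?<ddi>\d{9})$", "0${ddi}"));        // 0800123400
        // Rule 2: strip off the 415 prefix.
        Console.WriteLine(Apply("4150800123401", @"^4150?(?<ddi>\d{9})$", "0${ddi}")); // 0800123401
    }
}
```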

But here is my problem. In the last example (DDI = 123402077000000) I want to append the CLI to the end of the DDI, so I want to end up with 12340207700000001373000003.

I would like to be able to do this:

DDISearchPattern = "^12340?(?<ddi>\d{9})$"
DDITranslation   = "12340${ddi}${cli}"
CLISearchPattern = "^0?(?<cli>\d{9})$"
CLITranslation   = "0${cli}"

But unfortunately, the ${cli} capture group is part of the CLI regex, not the DDI regex. How can I 'load up' one regex with the captured groups from the other regex, so that I can do a replace using the captures from both?

I have found a way to do this, but it's very messy: it uses a regex to replace on @'\\$\\{cli\\}'. I really want to find a simpler, better way.
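
For reference, the behaviour behind this problem can be reproduced in a few lines. In .NET, a ${name} reference to a group that the pattern does not define is written to the output as literal text, so the ${cli} token survives the DDI replace untouched. The sketch below uses a made-up DDI chosen to fit the pattern as written; the class and method names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnknownGroupDemo
{
    public static string TranslateDdi(string ddi)
    {
        // The pattern defines only the 'ddi' group, so ${cli} is not a valid
        // group reference here and is emitted as literal text.
        return Regex.Replace(ddi, @"^12340?(?<ddi>\d{9})$", "12340${ddi}${cli}");
    }

    public static void Main()
    {
        // Made-up DDI that fits the pattern: 1234 + 0 + nine digits.
        Console.WriteLine(TranslateDdi("12340207700000")); // 12340207700000${cli}
    }
}
```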

edit
OK, I see what you're trying to do. Let's say the engine doesn't retain group values between regexes.

This would actually take two passes of each expression: a first pass to capture first/second, and a second pass to do the substitution using the first/second values saved off from pass 1.

string pattern_a = "(?<first>def)";
string pattern_b = "(?<second>456)";

// Pass 1: run a match with each pattern and save off the captures
string res_first  = Regex.Match(input_a, pattern_a).Groups["first"].Value;
string res_second = Regex.Match(input_b, pattern_b).Groups["second"].Value;

// Pass 2: run a replace of pattern_a on input_a and of pattern_b on input_b,
// substituting res_first and res_second for ${first} and ${second}

end
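
The two passes described above could be sketched in C# roughly as follows. This is only one possible reading of the idea, not a tested production implementation. TwoPassReplace, Inline and Translate are hypothetical names. The sketch assumes that only the first match of each pattern matters and that a token whose group captured nothing should become an empty string. If both patterns define a group with the same name, the other expression's value is inlined first and therefore wins.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TwoPassReplace
{
    // Replaces every ${name} token in 'translation' whose group is defined by
    // 'other' with the text that group captured. Any '$' in the captured text
    // is doubled so the later Regex.Replace treats it as a literal dollar.
    static string Inline(string translation, Regex other, Match otherMatch)
    {
        foreach (string name in other.GetGroupNames())
        {
            if (char.IsDigit(name[0])) continue; // skip numbered groups such as "0"
            string value = otherMatch.Groups[name].Value.Replace("$", "$$");
            translation = translation.Replace("${" + name + "}", value);
        }
        return translation;
    }

    public static (string A, string B) Translate(
        string inputA, string patternA, string translationA,
        string inputB, string patternB, string translationB)
    {
        var regexA = new Regex(patternA);
        var regexB = new Regex(patternB);

        // Pass 1: match each input with its own pattern and keep the results.
        Match matchA = regexA.Match(inputA);
        Match matchB = regexB.Match(inputB);

        // Pass 2: inline the other expression's captures, then replace as usual.
        string resultA = regexA.Replace(inputA, Inline(translationA, regexB, matchB));
        string resultB = regexB.Replace(inputB, Inline(translationB, regexA, matchA));
        return (resultA, resultB);
    }
}
```

With the inputs from the question, Translate("abc-def-ghi", "(?<first>def)", "A=${first}${second}", "123-4567-89", "(?<second>456)", "B=${second}${first}") gives "abc-A=def456-ghi" and "123-B=456def7-89".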

If I understand you correctly...
I don't know .NET yet, but usually a regex result is valid only up until the next regex runs, after which the previous results are invalid.

If the results do stay valid, you would just need more independent group names:

string input_a = "abc-def-ghi";
string input_b = "123-4567-89";

string pattern_a = "^(?<apre>.*)(?<first>def)(?<apost>.*)$";
string pattern_b = "^(?<bpre>.*)(?<second>456)(?<bpost>.*)$";

string translation_a = "${apre}A=${first}${second}${apost}";
string translation_b = "${bpre}B=${second}${first}${bpost}";

If the earlier results are invalid, you need to save off the captures after the first run. Something like this (warning: I am not familiar with string concatenation in .NET):

string input_a = "abc-def-ghi";
string input_b = "123-4567-89";

string pattern_a = "^(?<pre>.*)(?<first>def)(?<post>.*)$";
string pattern_b = "^(?<pre>.*)(?<second>456)(?<post>.*)$";

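// Do the regex for input_a and save off the captures here
// (requires: using System.Text.RegularExpressions;)
Match m_a = Regex.Match(input_a, pattern_a);

string A_pre   = m_a.Groups["pre"].Value;
string A_first = m_a.Groups["first"].Value;
string A_post  = m_a.Groups["post"].Value;

// Do the regex for input_b; ${pre}, ${second} and ${post} below refer to pattern_b

string translation_a = A_pre + "A=" + A_first + "${second}" + A_post;
string translation_b = "${pre}B=${second}" + A_first + "${post}";

string result_a = Regex.Replace(input_b, pattern_b, translation_a); // "abc-A=def456-ghi"
string result_b = Regex.Replace(input_b, pattern_b, translation_b); // "123-B=456def7-89"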
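
The same save-off idea can be applied to the DDI/CLI rules from the question. The sketch below is one hedged way to do it, using Match.Result to expand a translation against the other expression's match first. CallRule and Apply are hypothetical names. The sketch assumes that a rule applies only when both patterns match, that translations use named groups only, and that captured text contains no '$'. The numbers in the usage note are made up to fit the patterns as written.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CallRule
{
    // Hypothetical helper: translate a DDI/CLI pair with one rule, letting
    // either translation reference groups captured by the other pattern.
    public static (string Ddi, string Cli) Apply(
        string ddi, string cli,
        string ddiPattern, string ddiTranslation,
        string cliPattern, string cliTranslation)
    {
        Match ddiMatch = Regex.Match(ddi, ddiPattern);
        Match cliMatch = Regex.Match(cli, cliPattern);

        // Assumption: the rule is skipped unless both patterns match.
        // (Match.Result also throws if called on a failed match.)
        if (!ddiMatch.Success || !cliMatch.Success) return (ddi, cli);

        // Step 1: expand each translation against the *other* match. Groups
        // that match does not define (e.g. ${ddi} for the CLI match) are left
        // in place as literal text.
        string ddiTemplate = cliMatch.Result(ddiTranslation);
        string cliTemplate = ddiMatch.Result(cliTranslation);

        // Step 2: an ordinary replace expands the rule's own groups.
        return (Regex.Replace(ddi, ddiPattern, ddiTemplate),
                Regex.Replace(cli, cliPattern, cliTemplate));
    }
}
```

For example, Apply("12340207700000", "0137300003", @"^12340?(?<ddi>\d{9})$", "12340${ddi}${cli}", @"^0?(?<cli>\d{9})$", "0${cli}") returns the DDI "12340207700000137300003" and the CLI "0137300003".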
