
RegEx: replace with a new string a substring of a matched token/substring

I need to check the following matches in a string:

"SIN ", " SIN", " SIN ", "SX ", " SX", " SX ", "RIC ", " RIC", " RIC ", "OK ", " OK", " OK "

but replace, when these tokens are matched, only the substrings

"SIN", "SX", "RIC", "OK"

with another text, keeping spaces.

Every line must be considered a different input string.

In detail, I need to:

  • replace with "SINSC" the substrings "SIN" and "SX" inside the tokens "SIN ", " SIN", " SIN ", "SX ", " SX", " SX " every time one of them is matched

  • replace with "RICOM" the substring "RIC" inside the tokens "RIC ", " RIC", " RIC " every time one of them is matched

  • (matches for "OK ", " OK", " OK " serve another purpose, not replacement; I need them later in the code)

I wrote the following expression for the first filtering:

(^|\s+)(SIN|SX|RIC|OK)(\s+|$)

and it seems to work (I've considered the case of multiple spaces before and after). I've tried it in the following text:

(you can see demo at: https://regex101.com/r/vIZCGW/2 )

16M2 - SIN - 49.000 KM - SENS - A/C - n.d. - FROM:   - MATRIC.: n.d. - GEAR: n.d. - COD.PROD.RIC.: n.d. - NR.PLATE: 
14I2 - OK - 20.000 KM - A/C - n.d. - FROM: - MATRIC.: n.d. - GEAR: n.d. - COD.PROD.RIC.: n.d. - NR.PLATE: 
11A0 - SIN - 55.000 KM - SQUARE - SENS - A/C
16H0 - n.d. - n.d. - FROM:   - MATRIC.: n.d. - GEAR: n.d._n.d. marce - COD.PROD.RIC.: n.d. - NR.PLATE: 
14N1 - SIN - n.d. - FROM:   - MATRIC.: n.d. - GEAR: n.d._n.d. marce - COD.PROD.RIC.: n.d. - NR.PLATE:  - STEEL
16D2 - SIN - n.d. - FROM:   - MATRIC.: n.d. - GEAR: n.d._n.d. marce - COD.PROD.RIC.: n.d. - NR.PLATE: 
SX 100000 KM        15K2
SIN - 15D1
16P0 - OK - n.d. - FROM:   - MATRIC.: n.d. - GEAR: n.d._n.d. marce - COD.PROD.RIC.: n.d. - NR.PLATE: 
16H0 - SIN - n.d. - FROM:   - MATRIC.: n.d. - GEAR: n.d._n.d. marce - COD.PROD.RIC.: n.d. - NR.PLATE: 
16I1    SIN
14K1 - SIN - n.d. - FROM:   - MATRIC.: n.d. - GEAR: n.d._n.d. marce - COD.PROD.RIC.: n.d. - NR.PLATE: 
SX    14E2
SX     16D1 NO TURBO
SX 110000 KM          15M1
16O2 - SIN 
15J1 - SIN
16L1   SIN DAMAGED
16P2 - SIN - DAMAGED
SX          15E2
SX        9D2
SIN - 130.000 KM - 16J1
OK          13A0
SX        16M0
OK        11A1
OK        12V1
SX 105CV        15P1
OK 105CV        15O2
14A2 - SIN

My questions are basically 2:

  1. What should the regex replacement code look like?

  2. Why, in the demo at https://regex101.com/r/vIZCGW/2 , are some lines highlighted in light blue past the end of the line while others aren't?

Thanks!

For the replacement code, this article from the .NET documentation explains how substitutions work: https://docs.microsoft.com/en-us/dotnet/standard/base-types/substitutions-in-regular-expressions .

With the regex you have given I would write something like this:

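using System.Text.RegularExpressions;

// Group 1: leading whitespace (or start), group 2: the token, group 3: trailing whitespace (or end).
Regex regex = new Regex(@"(^|\s+)(SIN|SX|RIC|OK)(\s+|$)");

string result = regex.Replace(input, m =>
{
    string replacement;
    switch (m.Groups[2].Value)
    {
        case "SX":
        case "SIN":
            replacement = "SINSC";
            break;
        case "RIC":
            replacement = "RICOM";
            break;
        default:
            return m.Value; // "OK": leave the match untouched
    }
    // Re-attach the captured whitespace so the original spacing is kept.
    return m.Groups[1].Value + replacement + m.Groups[3].Value;
});

This code checks what the second group of your regex captured and substitutes the corresponding text. Because the whole match also contains the whitespace captured by groups 1 and 3, the callback puts those groups back around the new text; returning only "SINSC" or "RICOM" would delete the surrounding spaces.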
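One caveat with the pattern above: it consumes the whitespace around each token, so two tokens separated by a single space (for example "SIN SX") cannot both match, because the shared space is already used by the first match. A possible alternative, sketched here as an assumption rather than as part of the original answer, is to use lookarounds so the whitespace is only checked and never consumed. The class and method names (TokenReplacer, ReplaceTokens) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenReplacer
{
    // (?<!\S) : the token is not preceded by a non-whitespace character
    //           (so it sits at the start of the string or after whitespace).
    // (?!\S)  : the token is not followed by a non-whitespace character
    //           (so it sits at the end of the string or before whitespace).
    private static readonly Regex Token =
        new Regex(@"(?<!\S)(SIN|SX|RIC|OK)(?!\S)");

    // Processes one input line; the surrounding whitespace is never part of the match.
    public static string ReplaceTokens(string line)
    {
        return Token.Replace(line, m =>
        {
            switch (m.Groups[1].Value)
            {
                case "SX":
                case "SIN":
                    return "SINSC";
                case "RIC":
                    return "RICOM";
                default:
                    return m.Value; // "OK" is matched but left unchanged
            }
        });
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceTokens("16M2 - SIN - 49.000 KM - COD.PROD.RIC.: n.d."));
        Console.WriteLine(ReplaceTokens("SX        16M0"));
        Console.WriteLine(ReplaceTokens("SIN SX RIC OK"));
    }
}
```

Since only the token itself is matched, the replacement string can be returned directly and the spacing stays as it was. "RIC" inside "COD.PROD.RIC.:" is not touched, because it is surrounded by dots rather than whitespace.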

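As for your second question: the highlighting that runs past the end of a line is whitespace captured by your regex, for instance by the first group. `\s` also matches the line-break character, so when the whole text is tested as one string, `\s+` can include the newline, either before a token that starts the next line (first group) or after a token that ends the current line (third group). Lines where no token touches the line break are not highlighted there.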
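The effect can be reproduced outside the demo. The following is a minimal sketch, assuming the text is tested as one multi-line string with the multiline option enabled; the helper name FirstMatchEscaped is hypothetical. It prints the first match with the newline made visible, first for the original pattern and then for a variant that uses `[ \t]+` so that only spaces and tabs count as separators.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineBreakDemo
{
    // Returns the first match of the pattern, with "\n" shown literally as \n.
    public static string FirstMatchEscaped(string pattern, string text)
    {
        Match m = Regex.Match(text, pattern, RegexOptions.Multiline);
        return m.Value.Replace("\n", "\\n");
    }

    public static void Main()
    {
        string text = "16I1    SIN\n14K1 - SIN";

        // \s+ in the third group swallows the line break after "SIN".
        Console.WriteLine("[" + FirstMatchEscaped(@"(^|\s+)(SIN|SX|RIC|OK)(\s+|$)", text) + "]");

        // [ \t]+ cannot match the line break, so the match stops at the end of the line.
        Console.WriteLine("[" + FirstMatchEscaped(@"(^|[ \t]+)(SIN|SX|RIC|OK)([ \t]+|$)", text) + "]");
    }
}
```

If every line is passed to the regex as a separate input string, as the question requires, there is no line break inside the input and the difference between the two patterns disappears.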


 