
C# string manipulation by rules using regex

I have a situation.
I have a string that can contain digits, letters, and some symbols, and I want to extract parts of it or make replacements based on some "rules". I think the best approach is to give some examples of possible inputs and what I want to display:

String            Display1    or  Display2

AB_X345           X345        or  ###X345
AB_1234           1234        or  ###1234
X987_TEXT_4567    X9874567    or  X987######4567
X987TEXT4567      X9874567    or  X987####4567
X798TEXT          X798        or  X798####
789TEXT           789         or  789####
X400              X400        or  X400

So, in practice, when I find an X followed by digits, I want to display them. If any other text appears, I either don't want it displayed or I want it masked with a character (#). If no X is present, I want to display only the digits. Is regex the easiest way of doing this? (I am not familiar with regex; I have only had a bird's-eye view of it.) Can all the rules be gathered into a single regular expression, or is that too complicated?

Thank you for any suggestions.

That's easy:

resultString = Regex.Replace(subjectString, 
    @"\D       # Match a non-digit character
    (?<!       # unless...
     X         #  it's an X
     (?=\d)    #  which is followed by a digit.
    )          # End of lookbehind", 
    "", RegexOptions.IgnorePatternWhitespace);

Change the last line to

    "#", RegexOptions.IgnorePatternWhitespace);

to mask the characters with # instead of removing them.
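As a rough sketch, the same pattern can be wrapped in two small helpers and checked against the examples from the question. The class and method names (DisplayRules, Strip, Mask) are made up for illustration, and the pattern is written on one line so that RegexOptions.IgnorePatternWhitespace is not needed:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative only.
public static class DisplayRules
{
    // A non-digit character, unless it is an X immediately followed by a digit.
    private static readonly Regex Unwanted = new Regex(@"\D(?<!X(?=\d))");

    // Display1: remove every unwanted character.
    public static string Strip(string input)
    {
        return Unwanted.Replace(input, "");
    }

    // Display2: replace every unwanted character with '#'.
    public static string Mask(string input)
    {
        return Unwanted.Replace(input, "#");
    }
}
```

For example, DisplayRules.Strip("X987_TEXT_4567") should give X9874567 and DisplayRules.Mask("X987_TEXT_4567") should give X987######4567. Note that the X inside TEXT is dropped because it is followed by a letter, not a digit.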

Try this for Display 1: @"(?<![A-Za-z])X[0-9]+|[0-9]+"

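// Requires: using System.Text.RegularExpressions;
var rx = new Regex(@"(?<![A-Za-z])X[0-9]+|[0-9]+");
var matches = rx.Matches("X987_TEXT_4567");

// Concatenate the matches in order: "X987" + "4567".
var result = "";

foreach (Match match in matches)
{
    result += match.Value;
}
// result is now "X9874567"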

With .NET 4.0 or later you can even do it with a single match (this also needs using System.Linq):

var rx = new Regex(@"(?<![A-Za-z])(?<1>X[0-9]+)?(?:(?:[^0-9]*)(?<1>[0-9]+))*");
var match = rx.Match("X987_TEXT_4567_123");
var res = string.Concat(match.Groups[1].Captures.OfType<Capture>().Select(p => p.Value));

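But the regex at this point becomes a little unreadable :-) It also only keeps an X that sits at the very start of the match, so for AB_X345 it yields 345 rather than X345.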

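Try this regex (backslashes doubled for a regular C# string literal):

X\\d|\\d

or, in regex-literal notation with the global flag:

/X\d|\d/g

This matches every single digit, and every X that is immediately followed by a digit.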
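In C#, one way to use this pattern for Display1 is to collect all matches and join them. The following is only a sketch: the DigitExtractor class and its Extract method are hypothetical names, and it assumes .NET 4.0 or later for string.Concat over a sequence of strings. The pattern is written as a verbatim string, so the backslashes are not doubled:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative only.
public static class DigitExtractor
{
    // Joins every match of X\d|\d in the order it appears in the input.
    public static string Extract(string input)
    {
        var parts = Regex.Matches(input, @"X\d|\d")
                         .Cast<Match>()
                         .Select(m => m.Value);
        return string.Concat(parts);
    }
}
```

For instance, DigitExtractor.Extract("AB_X345") produces the matches X3, 4 and 5, which join to X345. This approach only covers Display1; it does not produce the masked form.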

Try this one; you can test it with the example linked below.

\d?X[0-9]+|[0-9]

Example:
http://rubular.com/r/cA5Y49pCtV


 