
parsing strings in C# the right way

What is the right way to parse a string in C#? I have the following pattern:

"|?\d+(:\d+([lLrR])?)?"

bool bV = whether there is a vertical line;
then int iL = a number;
skip ':'
then int iD = a number;
and bool bS = whether there is a letter.

If I were writing this in C, I could do it by hand like this:

char* s = str; char* s1 = str1;
if( *s == '|' ) { bV = 1; s++; }
while( isdigit (*s) )
{ *s1 = *s++; s1++; } *s1 = 0;
iL = atoi (str1);
// and so on..

But in C# this looks clumsy. I have read about String.Split and Regex.Split, but after they do their work I still need to write the branches myself. I think there must be a better way. Do you know one?


I'll try to explain. Is there a function made specifically for parsing? Something like:

parseString(" {'|' | ''} {0} : {1} {'L'|'R'}", string, int, int, string);

much like ReadConsole has places where it expects a numeric value. I'm just asking, since the Framework has many interesting functions.

For now I am trying to use a regular expression and have something like this:

 {
  string pattern = @"(?<bVect>[|]?)\s*(?<iDest>\d+)(\s*:\s*(?<iLvl>\d+)*\s*(?<bSide>[LRlr]?)?)";   
  RegexOptions option = RegexOptions.IgnoreCase; 
  Regex newReg = new Regex(pattern,option);  
  MatchCollection matches = newReg.Matches(str);

  // Here starting 'Branches' = 'if-statements' I mean
  if(matches.Count != 1) { /* error */}
  bV = (matches[0].Groups["bVect"].Value == "|");
  bS = (matches[0].Groups["bSide"].Value != "");
  string ss = matches[0].Groups["iDest"].Value;
  if (ss != "")
   iD = Convert.ToInt32 (ss);
  ss = matches[0].Groups["iLvl"].Value;
  if( ss != "" )
   iL = Convert.ToInt32 (ss);
 }

You can use the [] operator to access the characters of the string directly. From there, you can use your already working C algorithm in C#.

The only difference is that you can't use pointers (unless you want to use unsafe code, which is not recommended). So you have to keep an index into the string instead of using pointer arithmetic. Consider something like this:

int index = 0;
if( s[index] == '|' ) { bV = true; index++; }
// and so on
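
To make the index-based idea concrete, here is a minimal sketch of a complete hand-written scanner for the pattern from the question. The class name ManualParser and the TryParse signature are hypothetical, the variable names bV, iL, iD and bS follow the question, and rejecting trailing characters is an assumption rather than something the question specifies.

```csharp
using System;

static class ManualParser
{
    // Hypothetical helper: walks the string with an index instead of a pointer.
    // Accepts: optional '|', digits, then optionally ':' digits and one of l/L/r/R.
    public static bool TryParse(string s, out bool bV, out int iL, out int iD, out bool bS)
    {
        bV = false; iL = 0; iD = 0; bS = false;
        if (string.IsNullOrEmpty(s)) return false;

        int i = 0;

        // Optional leading vertical line.
        if (s[i] == '|') { bV = true; i++; }

        // First number (required). Only ASCII digits are accepted.
        int start = i;
        while (i < s.Length && s[i] >= '0' && s[i] <= '9') i++;
        if (i == start || !int.TryParse(s.Substring(start, i - start), out iL)) return false;

        // Optional ":number", optionally followed by a side letter.
        if (i < s.Length && s[i] == ':')
        {
            i++;
            start = i;
            while (i < s.Length && s[i] >= '0' && s[i] <= '9') i++;
            if (i == start || !int.TryParse(s.Substring(start, i - start), out iD)) return false;

            if (i < s.Length && "lLrR".IndexOf(s[i]) >= 0) { bS = true; i++; }
        }

        // Assumption: anything left over makes the input invalid.
        return i == s.Length;
    }
}
```

Using int.TryParse rather than Convert.ToInt32 means an overly long digit run is reported as a failed parse instead of an exception.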

I think you are looking for something like:

    var sourceString = @"|?\d+(:\d+([lLrR])?)?".ToCharArray();

    int length = sourceString.Length;
    int i = 0;
    while (i < length && char.IsDigit(sourceString[i++]))
    {
        // rest of the code

    }

If I understand you correctly, you can use your pattern with Regex.Match(input, pattern, RegexOptions.Singleline). The resulting Match object has a Groups property from which you can extract the desired data. (That gives you the first match; use Regex.Matches(...) to get a MatchCollection of all matches.)

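Each opening parenthesis produces a group, so you should add parentheses around the vertical line and around every part you are interested in. The vertical line also has to be escaped, because an unescaped | means alternation in a regex. As a verbatim string: @"(\|)?(\d+)(:(\d+)([lLrR])?)?"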

Now, if match.Success is true, you get:

/* match.Groups[0] is the whole match. */
bool bV = match.Groups[1].Success;
int iL = int.Parse(match.Groups[2].Value);
// Group 4 is optional; 0 is an assumed default when it is absent.
int iD = match.Groups[4].Success ? int.Parse(match.Groups[4].Value) : 0;
bool bS = match.Groups[5].Success;

Why not use named regex capture groups?
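
As a rough sketch of what named capture groups could look like here: the class name NamedGroupParser and its TryParse signature are hypothetical, the group names reuse the variable names from the question, and anchoring the pattern to the whole input and defaulting iD to 0 when the second number is absent are both assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

static class NamedGroupParser
{
    // The vertical line is escaped so it is matched literally.
    // \A and \z anchor the pattern to the whole input (an assumption).
    private static readonly Regex Pattern = new Regex(
        @"\A(?<bV>\|)?(?<iL>[0-9]+)(?::(?<iD>[0-9]+)(?<bS>[lLrR])?)?\z");

    public static bool TryParse(string s, out bool bV, out int iL, out int iD, out bool bS)
    {
        bV = false; iL = 0; iD = 0; bS = false;

        Match m = Pattern.Match(s ?? "");
        if (!m.Success) return false;

        // Group.Success tells whether an optional group took part in the match.
        bV = m.Groups["bV"].Success;
        bS = m.Groups["bS"].Success;

        // TryParse guards against digit runs that overflow an int.
        if (!int.TryParse(m.Groups["iL"].Value, out iL)) return false;
        if (m.Groups["iD"].Success && !int.TryParse(m.Groups["iD"].Value, out iD)) return false;

        return true;
    }
}
```

With names instead of numbers, adding or removing a pair of parentheses in the pattern does not silently shift the group indexes used in the code.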
