I need a regex PATTERN (to be used in C#) that will match integer values WITH 3-digit comma separators but WON'T return the commas in the resulting match value. For example, I need the following code to write 1, 1234, and 1234567 to the console:
string text = "This 1 is 1,234 a 1,234,567 sentence 7,654.321.";
// NOTE: value "7,654.321" would preferably NOT match,
// but it is acceptable for now if it does
MatchCollection matches = Regex.Matches(text, PATTERN);
foreach (Match match in matches)
    Console.Write(match.Value + " ");
I CANNOT call Regex.Matches and then do a String.Replace to remove the commas; it all must happen in the regex PATTERN (because all my regular expressions are pulled from a database and cannot include logic outside the pattern itself without lots of spaghetti code). As noted, I would prefer not to match rational values, but that should be easy to fix once I get the comma exclusion working.
The following pattern DOES NOT WORK, but it is probably pretty close to what I need:
// THIS PATTERN DOES NOT WORK!!!
// but is probably close to what I need
string PATTERN = @"([\+-]?[0-9]+[(?<=,)[0,9]{3}]*)([eE][\+]?[0-9]+)?";
If you remove the [(?<=,)[0,9]{3}]* from above, you have a standard integer pattern. Once again, I need to accept commas in the integer, but not return them as part of the match. How should I change this pattern?
A regex match is a whole substring of the input string. It can't be a set of substrings - it has to be one substring.
Similarly, the capturing groups can only capture substrings so you can't do much about this either.
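To make that limitation concrete, here is a minimal sketch. The class name ContiguousMatchDemo, the method FirstRawMatch, and the simple comma-grouped pattern are illustrative assumptions, not taken from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ContiguousMatchDemo
{
    // Returns the raw value of the first comma-grouped integer found in the input.
    // The pattern here is only an illustration of a plain comma-aware integer match.
    public static string FirstRawMatch(string input)
    {
        Match match = Regex.Match(input, @"\d{1,3}(?:,\d{3})*");
        // Value is always one contiguous slice of the input, commas included.
        return match.Success ? match.Value : string.Empty;
    }
}
```

Calling ContiguousMatchDemo.FirstRawMatch("a 1,234,567 sentence") returns "1,234,567": the commas sit inside the matched substring, so they are part of Value.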
But since you're using .NET you could try a really ugly hack by leveraging the capture stack, if you can afford to add some general-purpose code.
First, the regex. It is simplified to the minimum just so it's easier to understand:
(?:(?<concat>\d+),?)+
A full version of the regex is provided below, but for now we'll stick with that one.
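As a rough illustration of what the capture stack holds for this simplified regex, the sketch below lists every capture of the concat group for the first match. The class and method names (CaptureStackDemo, ConcatCaptures) are made up for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureStackDemo
{
    // Lists every capture recorded by the "concat" group for the first match.
    // Each pass through the repeated (?:...)+ group adds one capture.
    public static string[] ConcatCaptures(string input)
    {
        Match match = Regex.Match(input, @"(?:(?<concat>\d+),?)+");
        return match.Groups["concat"].Captures
            .Cast<Capture>()
            .Select(c => c.Value)
            .ToArray();
    }
}
```

For the input "1,234,567" the match value is still "1,234,567", but ConcatCaptures returns the three pieces "1", "234", and "567", which contain no commas.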
Then, in your code you could implement the following logic: if the regex contains a group named concat, concatenate the values in match.Groups["concat"].Captures; otherwise, process the match as usual.
The code would look something like this:
// Requires: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
public static IEnumerable<string> GetValues(string input)
{
    // Suppose regex could be any regex
    var regex = new Regex(@"(?:(?<concat>\d+),?)+");
    foreach (Match match in regex.Matches(input))
    {
        // Does this regex have our special feature?
        // GroupNumberFromName returns -1 when no such group exists.
        if (regex.GroupNumberFromName("concat") >= 0)
        {
            // Concat the captured values
            var captures = match.Groups["concat"].Captures.Cast<Capture>().Select(c => c.Value).ToArray();
            yield return String.Concat(captures);
        }
        else
        {
            // This is a normal regex
            yield return match.Value;
        }
    }
}
Ok, this is a hack, but it would let you keep the logic in a declarative and reusable way in the regex.
Now the full regex you posted would look something like this in its hacked version:
(?<concat>[-+])?(?<concat>[0-9]+)(?:,(?<concat>[0-9]{3}))*(?<concat>[eE][-+]?[0-9]+)?
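Here is a sketch of how that full pattern could be applied to the sample text from the question. The class FullPatternDemo and the method ExtractValues are hypothetical names for this example; only the pattern itself comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class FullPatternDemo
{
    // The full pattern: every piece worth keeping is captured under the name "concat".
    private const string Pattern =
        @"(?<concat>[-+])?(?<concat>[0-9]+)(?:,(?<concat>[0-9]{3}))*(?<concat>[eE][-+]?[0-9]+)?";

    // Joins the "concat" captures of each match, which leaves the commas out.
    public static List<string> ExtractValues(string text)
    {
        var values = new List<string>();
        foreach (Match match in Regex.Matches(text, Pattern))
        {
            var parts = match.Groups["concat"].Captures
                .Cast<Capture>()
                .Select(c => c.Value);
            values.Add(string.Concat(parts));
        }
        return values;
    }
}
```

For "This 1 is 1,234 a 1,234,567 sentence 7,654.321." this yields 1, 1234, and 1234567, followed by 7654 and 321 as two separate values, because the pattern does nothing to exclude numbers with a decimal point (which the question said is acceptable for now).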