I've got a random input string to validate and tokenize.
My aim is to check whether the string consists of the pattern
[a-zA-Z]{2}\d{2} repeated one or more times, separated by commas.
So:
aa12,af43,ad46 -> is valid
,aa12,aa44 -> is NOT valid (initial comma)
aa12, -> is NOT valid (trailing comma)
That's the first part, validation. Then, with the same regex, I've got to create a group for each occurrence of the pattern (a match collection).
So:
aa12,af34,tg53
is valid and must create the following groups
Group 1 -> aa12
Group 2 -> af34
Group 3 -> tg53
Is it possible to have it done with only one regex that validates and creates the groups?
I've written this:
^([a-zA-Z]{2}\d{2})(?:(?:[,])([a-zA-Z]{2}\d{2})(?:[,])([a-zA-Z]{2}\d{2}))*(?:[,])([a-zA-Z]{2}\d{2})*|$
It creates the groups more or less correctly, but the validation is too loose: it also accepts strings that do not follow the pattern.
Any hints would be very welcome.
You can use
using System;
using System.Linq;
using System.Text.RegularExpressions;

var text = "aa12,af43,ad46";
var pattern = @"^(?:([a-zA-Z]{2}\d{2})(?:,\b|$))+$";
// Group 1 is repeated, so each occurrence is stored in its Captures collection
var result = Regex.Matches(text, pattern)
    .Cast<Match>()
    .Select(x => x.Groups[1].Captures.Cast<Capture>().Select(m => m.Value))
    .ToList();
foreach (var list in result)
    Console.WriteLine(string.Join("; ", list));
// => aa12; af43; ad46
See the C# demo online and the regex demo.
Regex details
^ - start of string
(?:([a-zA-Z]{2}\d{2})(?:,\b|$))+ - one or more occurrences of:
    ([a-zA-Z]{2}\d{2}) - Group 1: two ASCII letters and then two digits
    (?:,\b|$) - either a comma followed by a word char, or the end of string
$ - end of string. You may use \z instead if you want to prevent a match before a trailing newline (LF) char.
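To see the validation side as well, here is a minimal sketch that runs the same pattern against the valid and invalid inputs from the question. It assumes the \z variant mentioned above, and the TokenValidator class and TryTokenize helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class TokenValidator
{
    // Same pattern as in the answer, with \z in place of $ so that
    // an input ending in a newline is rejected too.
    private static readonly Regex Pattern =
        new Regex(@"^(?:([a-zA-Z]{2}\d{2})(?:,\b|\z))+\z");

    // Hypothetical helper: true plus the tokens when the whole input is valid,
    // false plus an empty array otherwise.
    public static bool TryTokenize(string input, out string[] tokens)
    {
        Match m = Pattern.Match(input);
        tokens = m.Success
            ? m.Groups[1].Captures.Cast<Capture>().Select(c => c.Value).ToArray()
            : Array.Empty<string>();
        return m.Success;
    }

    static void Main()
    {
        foreach (var s in new[] { "aa12,af43,ad46", ",aa12,aa44", "aa12,", "aa12" })
        {
            bool ok = TryTokenize(s, out var tokens);
            Console.WriteLine($"{s} -> {(ok ? string.Join("; ", tokens) : "NOT valid")}");
        }
    }
}
```

Because the pattern is anchored at both ends, one Regex.Match call is enough: Success answers the validation question, and Groups[1].Captures holds the tokens in order.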