
How can I write a regex with matching groups for a comma-separated string?

I've got a random input string to validate and tokenize.

My aim is to check whether my string matches the following pattern:

[a-zA-Z]{2}\d{2}, repeated one or more times and separated by commas

So:
aa12,af43,ad46 -> is valid
,aa12,aa44 -> is NOT valid (initial comma)
aa12, -> is NOT valid (trailing comma)

That's the first part, validation. Then, with the same regex, I need to create a group for each occurrence of the pattern (a match collection).

So:

aa12,af34,tg53
is valid and must create the following groups
Group 1 -> aa12
Group 2 -> af34
Group 3 -> tg53

Is it possible to have it done with only one regex that validates and creates the groups?

I've written this

    ^([a-zA-Z]{2}\d{2})(?:(?:[,])([a-zA-Z]{2}\d{2})(?:[,])([a-zA-Z]{2}\d{2}))*(?:[,])([a-zA-Z]{2}\d{2})*|$

but although it creates the groups more or less correctly, it fails at validation: it also accepts strings that do not follow the pattern.

Any hints would be very very welcome
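
One likely reason the attempt above accepts invalid input is the top-level `|$`: alternation has the lowest precedence, so the pattern reads as "the whole anchored expression OR just end of string", and the second branch matches an empty string at the end of any input. The first branch also has no closing anchor and requires a comma after the first token, so it accepts `aa12,`. A minimal sketch to observe this (the sample strings are made up for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern from the question, copied unchanged.
string attempt = @"^([a-zA-Z]{2}\d{2})(?:(?:[,])([a-zA-Z]{2}\d{2})(?:[,])([a-zA-Z]{2}\d{2}))*(?:[,])([a-zA-Z]{2}\d{2})*|$";

foreach (var sample in new[] { "aa12,", ",aa12,aa44", "not valid at all" })
{
    Match m = Regex.Match(sample, attempt);
    // Every sample "succeeds"; the last two only match the empty string at the end.
    Console.WriteLine($"'{sample}' -> Success={m.Success}, matched '{m.Value}'");
}
// 'aa12,' -> Success=True, matched 'aa12,'
// ',aa12,aa44' -> Success=True, matched ''
// 'not valid at all' -> Success=True, matched ''
```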

You can use

    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    var text = "aa12,af43,ad46";
    var pattern = @"^(?:([a-zA-Z]{2}\d{2})(?:,\b|$))+$";
    // Group 1 repeats, so read every value it captured from Groups[1].Captures.
    var result = Regex.Matches(text, pattern)
            .Cast<Match>()
            .Select(x => x.Groups[1].Captures.Cast<Capture>().Select(m => m.Value))
            .ToList();
    foreach (var list in result)
        Console.WriteLine(string.Join("; ", list));
    // => aa12; af43; ad46

See the C# demo online and the regex demo.
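
Because the pattern is anchored at both ends, a single `Regex.Match` call can serve for both validation and tokenizing. The following is a minimal sketch of that idea; `TryTokenize` is a hypothetical helper name introduced here for illustration, not something from the answer or from .NET.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: returns true when the input is valid and, in that
// case, hands back every token captured by Group 1.
bool TryTokenize(string input, out string[] tokens)
{
    // Same pattern as in the answer above.
    const string pattern = @"^(?:([a-zA-Z]{2}\d{2})(?:,\b|$))+$";
    Match m = Regex.Match(input, pattern);
    // Group 1 is overwritten on each repetition, but .NET keeps every
    // intermediate value in Groups[1].Captures.
    tokens = m.Success
        ? m.Groups[1].Captures.Cast<Capture>().Select(c => c.Value).ToArray()
        : Array.Empty<string>();
    return m.Success;
}

foreach (var sample in new[] { "aa12,af43,ad46", ",aa12,aa44", "aa12," })
{
    bool ok = TryTokenize(sample, out var parts);
    string shown = ok ? string.Join("; ", parts) : "NOT valid";
    Console.WriteLine(sample + " -> " + shown);
}
// aa12,af43,ad46 -> aa12; af43; ad46
// ,aa12,aa44 -> NOT valid
// aa12, -> NOT valid
```

Note that this relies on the .NET capture collection; regex engines that keep only the last value of a repeated group would need a separate split step after validation.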

Regex details

  • ^ - start of string
  • (?:([a-zA-Z]{2}\d{2})(?:,\b|$))+ - one or more occurrences of
    • ([a-zA-Z]{2}\d{2}) - Group 1: two ASCII letters followed by two digits
    • (?:,\b|$) - either a comma followed by a word character, or the end of the string
  • $ - end of string. Use \z instead if you do not want a match when the string ends with a trailing newline (LF) character.
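
The difference between `$` and `\z` only shows up when the input ends with a line feed. A small sketch to illustrate it, assuming the `\z` variant simply swaps `\z` in for each `$` (the variable names and the sample input are illustrative):

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the answer, and an assumed variant with \z in place of $.
string dollar = @"^(?:([a-zA-Z]{2}\d{2})(?:,\b|$))+$";
string strict = @"^(?:([a-zA-Z]{2}\d{2})(?:,\b|\z))+\z";

string withNewline = "aa12,af43\n";

// $ also matches just before a final \n, so this input is accepted.
bool dollarAccepts = Regex.IsMatch(withNewline, dollar);
// \z matches only at the very end of the string, so this input is rejected.
bool strictAccepts = Regex.IsMatch(withNewline, strict);

Console.WriteLine(dollarAccepts);                      // True
Console.WriteLine(strictAccepts);                      // False
Console.WriteLine(Regex.IsMatch("aa12,af43", strict)); // True
```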

