How can I match regex named capturing groups in order?

I have a regex for capturing commands in a C# console app. The user can write multiple commands on the same line. The problem is that the commands are captured not in the order they were written, but in the order in which the named capturing groups appear in the regex. For example, if the user types:

  • QI

The regex below reports I first and then Q, so the commands are executed in the wrong order. Is it possible to fix this within the regex?

^((?<state>I\s?)?()|(?<quit>Q\s?)?()|(?<time>VR (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2}\:\d{2})\s?)?()|(?<connection_type>V (?:PU|PO|OS) (:?S|Z) (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2}\:\d{2}) (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2}\:\d{2})\s?)?()|(?<request_file>UR [0-9a-zA-Z_-]{1,}\.csv\s?)?()|(?<create_reserved_request>ZD [0-9]{1,}\s?)?()|(?<create_request>ZP [0-9]{1,} [0-9]{1,}\s?)?()|(?<ship_channel>F [0-9]{1,} [0-9]{1,}( [Q])?\s?)?()|(?<table_formating>T(( P)?()|( Z)?()|( RB)?\s?){1,3})?()|(?<occupied_connections_by_type>ZA (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2})\s?)?()|(?<print_data>VF(( R)?()|( B)?()|( M)?()|( K)?()|( D)?\s?){1,8})?)*$

Code:

string regexCommands = @"(^((?<status_vezova>I\s?)?()|(?<prekid_rada>Q\s?)?()|(?<vrijeme>VR (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2}\:\d{2})\s?)?()" +
                @"|(?<vezovi_po_vrsti>V (?:PU|PO|OS) (:?S|Z) (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2}\:\d{2}) (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2}\:\d{2})\s?)?()|" +
                @"(?<datoteka_zahtjeva>UR [0-9a-zA-Z_-]{1,}\.csv\s?)?()|(?<kreiranje_rezerviranog_zahtjev>ZD [0-9]{1,}\s?)?()|" +
                @"(?<kreiranje_zahtjeva>ZP [0-9]{1,} [0-9]{1,}\s?)?()|(?<komunikacija_brod_kanal>F [0-9]{1,} [0-9]{1,}( [Q])?\s?)?()|" +
                @"(?<format_ispisa_tablica>T(( P)?()|( Z)?()|( RB)?\s?){1,3})?()|(?<zauzeti_vezovi_prema_vrsti>ZA (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2})\s?)?()|(?<ispis_podataka>VF(( R)?()|( B)?()|( M)?()|( K)?()|( D)?\s?){1,8})?)*$)";

string commands = Console.ReadLine();
Regex regex = new Regex(regexCommands);
Match match = regex.Match(commands);
if (!match.Success)
{
    throw new Exception();
}

List<KeyValuePair<string, string>> listRegexGroupsAndValues = new List<KeyValuePair<string, string>>();
GroupCollection groups = match.Groups;

// Insert all named groups and their values into the list
foreach (string groupName in regex.GetGroupNames())
{
    // Skip numbered (unnamed) groups, whose names are short digit strings
    if (groupName.Length > 2)
        listRegexGroupsAndValues.Add(new KeyValuePair<string, string>(groupName, groups[groupName].Value));
}


//print
foreach (KeyValuePair<string, string> pair in listRegexGroupsAndValues)
{
    Console.WriteLine(pair.Key + " " + pair.Value);
}

This way I have all the commands the user has written in one list of key-value pairs. I can then iterate through this list and execute each command using a factory method.

When the Regex returns a match, every group exposes an Index property (inherited from the underlying Capture class) that tells you at which position in the original string it matched.

If you are matching against a single line of commands, ordering the resulting captures/groups by this Index should give you the results in the order they were typed.

Something like this should work:

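// Requires: using System.Linq; using System.Text.RegularExpressions;
foreach (var group in Regex
    .Match(cmdInput, "your_pattern")
    .Groups
    .Values
    // Keep only named groups that actually matched
    .Where(g => g.Success && !int.TryParse(g.Name, out _))
    // Group.Index is the position of the group's last capture
    .OrderBy(g => g.Index))
{
    // Use the match here
}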
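
One caveat with ordering groups this way: when a named group sits inside a repeated construct (as with the outer `*` in the question's pattern), `Group.Index` and `Group.Value` describe only the last capture of that group, while `Group.Captures` keeps every one. The sketch below illustrates this with a hypothetical two-command pattern; the class `CaptureDemo`, the method `Describe` and the shortened pattern are assumptions made for illustration only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    public static string Describe(string input)
    {
        // Hypothetical, shortened pattern with the same outer repetition as the question
        Match m = Regex.Match(input, @"^(?:(?<state>I\s?)|(?<quit>Q\s?))*$");
        Group quit = m.Groups["quit"];

        // quit.Index is where the LAST "Q" matched; quit.Captures holds every "Q"
        string all = string.Join(",", quit.Captures.Cast<Capture>().Select(c => c.Index));
        return $"last={quit.Index}, all={all}";
    }
}
```

For the input `"Q I Q"`, `Describe` returns `last=4, all=0,4`: the group alone reports only the second `Q`, so a command typed twice needs its captures, not just the group, to be ordered.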

None of this means that a single Regex is the best way to parse commands this complex, but if you want to keep using one, ordering by Index should work.
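
If you would rather move away from one anchored pattern with an outer `*`, one possible alternative is to describe a single command per match and let `Regex.Matches` walk the input, since matches come back in the order they occur. This is only a sketch under stated assumptions: it covers three of the commands (`I`, `Q`, `ZP`), and the class `CommandTokenizer`, the method `Tokenize` and the trimmed-down pattern are hypothetical names that do not appear in the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CommandTokenizer
{
    // Hypothetical, simplified pattern: exactly one command per match.
    // \G pins each match to the end of the previous one, so no input is skipped.
    private static readonly Regex OneCommand = new Regex(
        @"\G(?:(?<state>I)|(?<quit>Q)|(?<create_request>ZP [0-9]+ [0-9]+))\s?",
        RegexOptions.ExplicitCapture);

    public static List<(string Name, string Text)> Tokenize(string input)
    {
        var result = new List<(string Name, string Text)>();
        int consumed = 0;

        foreach (Match m in OneCommand.Matches(input))
        {
            foreach (string name in OneCommand.GetGroupNames())
            {
                // Skip group "0" (the whole match); keep the named group that matched
                if (name != "0" && m.Groups[name].Success)
                {
                    result.Add((name, m.Groups[name].Value));
                }
            }
            consumed = m.Index + m.Length;
        }

        // Anything left over was not a recognized command
        if (consumed != input.Length)
        {
            throw new FormatException($"Unrecognized command at position {consumed}.");
        }
        return result;
    }
}
```

With this sketch, `CommandTokenizer.Tokenize("QI")` yields `quit` followed by `state`, which is the order the commands were typed, and no sorting step is needed.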

You can sort the captures by their Index property: first collect all captures from the named groups, then order them by Index.

const string regEx = @"^((?<state>I\s?)?()|(?<quit>Q\s?)?()|(?<time>VR (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2}\:\d{2})\s?)?()|(?<connection_type>V (?:PU|PO|OS) (:?S|Z) (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2}\:\d{2}) (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2}\:\d{2})\s?)?()|(?<request_file>UR [0-9a-zA-Z_-]{1,}\.csv\s?)?()|(?<create_reserved_request>ZD [0-9]{1,}\s?)?()|(?<create_request>ZP [0-9]{1,} [0-9]{1,}\s?)?()|(?<ship_channel>F [0-9]{1,} [0-9]{1,}( [Q])?\s?)?()|(?<table_formating>T(( P)?()|( Z)?()|( RB)?\s?){1,3})?()|(?<occupied_connections_by_type>ZA (\d{2}\.\d{2}\.\d{4}\. \d{2}\:\d{2})\s?)?()|(?<print_data>VF(( R)?()|( B)?()|( M)?()|( K)?()|( D)?\s?){1,8})?)*$";
const string input = "Q I Q";

List<(string commandGroup, string command, int index)> list = new List<(string commandGroup, string command, int index)>();

// Get first match
Match match = Regex.Match(input, regEx);

// When match then get relevant groups
if (match.Success)
{
    foreach (Group group in match.Groups)
    {
        // Keep groups that matched and whose names are not numbers
        if (group.Success && !int.TryParse(group.Name, out _))
        {
            // Add all Captures with group name, match value and match index
            foreach (Capture capture in group.Captures)
            {
                list.Add((group.Name, capture.Value, capture.Index));
            }
        }
    }
}
else
{
    // ... throw Exception
}

// Order by Index, i.e. each capture's original position in the input
list = list.OrderBy(l => l.index).ToList();

foreach ((string commandGroup, string command, int index) in list)
{
    Console.WriteLine(commandGroup + ": " + command);
}

Output is:

quit: Q
state: I
quit: Q
