Enumerate regex match names/values

What is the C# equivalent of this pseudo-code?

var pattern = ...;
var lookup = new Dictionary<string, string>();

foreach (var group in new Regex(pattern).Matches())
{
    lookup[group.Name] = group.Value;
}

I don't see any System.Text.RegularExpressions group-related object that exposes the group name.

What am I missing?

What I'm actually trying to do is convert a file with lines in this format:

eventName|message|date

To an IEnumerable<EventLogLine>, with EventLogLine being:

public struct EventLogLine
{
    public string EventName { get; set; }
    public string Message { get; set; }
    public DateTime Date { get; set; }
}

And put those lines into an IDictionary<string /*EventName*/, IEnumerable<EventLogLine>>.

To more directly answer your original question (without commenting on your approach), as I had a similar problem...

According to the Mono source code, the enumeration for the Groups indexer is based on the private Match.regex field, so you still need to have the Regex at hand. If you do, as in your example above...

public static Dictionary<string, string> ToDictionary(
    Regex regex, GroupCollection groups)
{
    var groupDict = new Dictionary<string, string>();
    foreach (string name in regex.GetGroupNames()){ //the only way to get the names
        Group namedGroup = groups[name]; //test for existence
        if (namedGroup.Success)
            groupDict.Add(name, namedGroup.Value);
    }
    return groupDict;
}

Or, with LINQ:

regex.GetGroupNames()
  .Where(name => groups[name].Success)
  .ToDictionary(name => name, name => groups[name].Value)
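Two notes on the snippets above. GetGroupNames() also returns the numbered groups as strings, including "0" for the whole match, so the resulting dictionary contains a "0" entry as well. Also, on newer runtimes (.NET Framework 4.7 and later, and .NET Core, as far as I know) Group itself exposes a Name property, so the Regex instance is no longer needed. A minimal sketch, assuming such a runtime; RegexGroupDemo and NamedGroups are names made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexGroupDemo
{
    // Hypothetical helper: collects every successful, explicitly named group of a match.
    public static Dictionary<string, string> NamedGroups(Match match)
    {
        return match.Groups
            .Cast<Group>()   // GroupCollection is only a non-generic collection on older frameworks
            // Skip group "0" (the whole match) and any unnamed, numbered groups.
            .Where(g => g.Success && !int.TryParse(g.Name, out int _))
            .ToDictionary(g => g.Name, g => g.Value);
    }
}
```

Called with a match of a pattern such as ^(?<eventName>[^|]*)\|(?<message>[^|]*)\|(?<date>[^|]*)$, this should yield exactly the keys eventName, message and date.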

I just knocked this up using LINQ. It relies on the List<string> being filled with the lines in the file.

var lines = new List<string>();
var dict = lines.Select(l =>
    {
        var sp = l.Split('|');
        return new EventLogLine { EventName = sp[0], Message = sp[1], Date = DateTime.Parse(sp[2]) };
    })
    .GroupBy(e => e.EventName)
    .ToDictionary(grp => grp.Key, grp => grp.AsEnumerable());

Basically, you convert each line to an EventLogLine using Select(), then use GroupBy() to group by EventName, and finally call ToDictionary() to run the query and create your dictionary in the required format.

See the example in the Match.Groups MSDN article. I think you should look at Alastair's answer though: since your input is so simple, the code would probably be easier to read later if you just use ReadLine and Split.
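For what it's worth, the ReadLine-and-Split approach could look roughly like this. It is only a sketch: EventLogReader and ReadAll are hypothetical names, the struct is the one from the question, and it assumes the dates are written in a culture-independent format.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

public struct EventLogLine
{
    public string EventName { get; set; }
    public string Message { get; set; }
    public DateTime Date { get; set; }
}

public static class EventLogReader
{
    // Hypothetical helper: lazily yields one EventLogLine per non-empty input line.
    public static IEnumerable<EventLogLine> ReadAll(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0) continue;   // tolerate blank lines

            string[] parts = line.Split('|');
            if (parts.Length != 3)
                throw new FormatException("Invalid event log line: " + line);

            yield return new EventLogLine
            {
                EventName = parts[0],
                Message = parts[1],
                // Assumption: the date column parses with the invariant culture.
                Date = DateTime.Parse(parts[2], CultureInfo.InvariantCulture)
            };
        }
    }
}
```

You would wrap a StreamReader over the log file in a using block and pass it to ReadAll; because the method is an iterator, the reader has to stay open while the result is being enumerated.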

Consider using ToLookup rather than ToDictionary. Lookups work naturally with LINQ and with generic code in general, because they are immutable and expose a very simple API. Also, I would encapsulate the parsing in the EventLogLine struct.

As a result, the code would look like this:

IEnumerable<string> lines;

ILookup<string, EventLogLine> lookup = 
    lines.Select(EventLogLine.Parse).ToLookup(evtLine => evtLine.EventName);

An example consumer:

if(lookup["HorribleEvent"].Any())
    Console.WriteLine("OMG, Horrible!");

foreach(var evt in lookup["FixableEvent"])
    FixIt(evt);

var q = from evtName in relevantEventNames
        from evt in lookup[evtName]
        select MyProjection(evt);

Note that you do not need to check for key existence, unlike with a Dictionary:

if(dictionary.ContainsKey("HorribleEvent")) //&& dictionary["HorribleEvent"].Any() sometimes needed
    Console.WriteLine("OMG, Horrible!");

if(dictionary.ContainsKey("FixableEvent"))
    foreach(var evt in dictionary["FixableEvent"])
        FixIt(evt);

var q = from evtName in relevantEventNames.Where(dictionary.ContainsKey)
        from evt in dictionary[evtName]
        select MyProjection(evt);

As you may notice, working with a dictionary containing IEnumerable values introduces subtle friction - ILookup is what you want!
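To make the difference concrete, here is a small sketch (LookupDemo and both method names are invented for this example). Indexing a lookup with a key it does not contain simply gives you an empty sequence, whereas the dictionary indexer would throw a KeyNotFoundException, so the dictionary version needs TryGetValue or ContainsKey:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // With a lookup, an absent key yields an empty sequence, so no guard is needed.
    public static int CountInLookup(ILookup<string, string> lookup, string key)
    {
        return lookup[key].Count();
    }

    // With a dictionary, the key has to be checked first to avoid an exception.
    public static int CountInDictionary(
        IDictionary<string, IEnumerable<string>> dictionary, string key)
    {
        IEnumerable<string> values;
        return dictionary.TryGetValue(key, out values) ? values.Count() : 0;
    }
}
```

Both methods return 0 for a missing key, but only the dictionary version has to say so explicitly.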

Finally, the modified EventLogLine:

public struct EventLogLine {
    public string EventName { get; private set; }
    public string Message { get; private set; }
    public DateTime Date { get; private set; }

    public static EventLogLine Parse(string line) {
        var splitline = line.Split('|');
        if(splitline.Length != 3) throw new ArgumentException("Invalid event log line");
        return new EventLogLine { 
            EventName = splitline[0],
            Message = splitline[1],
            Date = DateTime.Parse(splitline[2]),
        };
    }
}

To answer this part of your question:

I don't see any System.Text.RegularExpressions group-related object that exposes the group name. What am I missing?

I have adapted Eamon Nerbonne's struct to use regular expressions:

public struct EventLogLine
{
    public string EventName { get; private set; }
    public string Message { get; private set; }
    public DateTime Date { get; private set; }

    private static Regex expectedLineFormat = new Regex(
            @"^(?<eventName>[^|]*)\|(?<message>[^|]*)\|(?<date>[^|]*)$",
            RegexOptions.Singleline | RegexOptions.Compiled
    );

    public static EventLogLine Parse(string line) {

        Match match = expectedLineFormat.Match(line);

        if (match.Success) {
            return new EventLogLine {
                EventName = match.Groups["eventName"].ToString(),
                Message = match.Groups["message"].ToString(),
                Date = DateTime.Parse(match.Groups["date"].ToString())
            };
        }
        else {
            throw new ArgumentException("Invalid event log line");
        }
    }
}
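If throwing on every malformed line is too heavy for your log files, a non-throwing variant in the style of the framework's TryParse methods may be handier. This is only a sketch: TryParse is my own addition, not part of either struct above, and parsing the date with the invariant culture is an assumption about your data.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public struct EventLogLine
{
    public string EventName { get; private set; }
    public string Message { get; private set; }
    public DateTime Date { get; private set; }

    private static readonly Regex ExpectedLineFormat = new Regex(
        @"^(?<eventName>[^|]*)\|(?<message>[^|]*)\|(?<date>[^|]*)$",
        RegexOptions.Compiled);

    // Hypothetical non-throwing counterpart to Parse: returns false for a null line,
    // a line that does not have exactly three fields, or an unparsable date.
    public static bool TryParse(string line, out EventLogLine result)
    {
        result = default(EventLogLine);
        if (line == null) return false;

        Match match = ExpectedLineFormat.Match(line);
        DateTime date;
        if (!match.Success ||
            !DateTime.TryParse(match.Groups["date"].Value, CultureInfo.InvariantCulture,
                               DateTimeStyles.None, out date))
        {
            return false;
        }

        result = new EventLogLine
        {
            EventName = match.Groups["eventName"].Value,
            Message = match.Groups["message"].Value,
            Date = date
        };
        return true;
    }
}
```

Callers can then skip or log bad lines instead of wrapping every call in a try/catch.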
