
C# custom file parsing with 2 delimiters and different record types

I have a (not quite valid) CSV file that contains rows of multiple types. Any record could be one of about 6 different types and each type has a different number of properties. The first part of any row contains the timestamp and the type of record, followed by a standard CSV of the data.

Example

1456057920 PERSON, Ted Danson, 123 Fake Street, 555-123-3214, blah
1476195120 PLACE, Detroit, Michigan, 12345
1440581532 THING, Bucket, Has holes, Not a good bucket

And to make matters more complex, I need to be able to do different things with the records depending on certain criteria. So a PERSON type can be automatically inserted into a DB without user input, but a THING type would be displayed on screen for the user to review and approve before adding to DB and continuing the parse, etc.

Normally, I would use a library like CsvHelper to map the records to a type, but since the record types differ and the first part of each row uses a space instead of a comma, I don't know how to do that with a standard CSV library. So currently, what I do on each loop iteration is:

  1. Split the line on commas.
  2. Split the first array item on the space.
  3. Use a switch statement to determine the type and create the object.
  4. Put that object into a List of type object.
  5. Get confused about where to go next, because I now have a list of various types and will need yet another switch or if to decide what to do with each item.

I don't really know for sure if I will actually need that List but I have a feeling the user will want the ability to manually flip through records in the file.

By this point, this is starting to make for very long, confusing code, and my gut feeling tells me there has to be a cleaner way to do this. I thought maybe using Type.GetType(string) would help simplify the code some, but this seems like it might be terribly inefficient in a loop with 10k+ records and might make things even more confusing. I then thought maybe making some interfaces might help, but I'm not the greatest at using interfaces in this context and I seem to end up in about this same situation.

So what would be a more manageable way to parse this file? Are there any C# parsing libraries out there that would be able to handle something like this?

You can implement an IRecord interface that has a Timestamp property and a Process method (perhaps others as well). Then, implement concrete types for each type of record.

  1. Use a switch statement to determine the type and create and populate the correct concrete type.

  2. Place each object in a List<IRecord>.
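The interface and the two steps above could be sketched roughly like this. It is only a minimal sketch under some assumptions: the leading number is treated as a Unix timestamp in seconds, the class and property names (PersonRecord, ThingRecord, RecordParser, Name, Description, and so on) are made up for illustration, and the naive comma split does not handle quoted fields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical interface: a timestamp plus a Process method, as described above.
public interface IRecord
{
    DateTimeOffset Timestamp { get; }
    void Process();
}

public sealed class PersonRecord : IRecord
{
    public DateTimeOffset Timestamp { get; set; }
    public string Name { get; set; } = "";
    public string Address { get; set; } = "";
    public string Phone { get; set; } = "";

    // A PERSON needs no user input, so it could go straight to the DB.
    public void Process() => Console.WriteLine($"Auto-inserting {Name} into the DB");
}

public sealed class ThingRecord : IRecord
{
    public DateTimeOffset Timestamp { get; set; }
    public string Name { get; set; } = "";
    public string Description { get; set; } = "";

    // A THING would be shown to the user for approval first.
    public void Process() => Console.WriteLine($"Waiting for approval of {Name}");
}

public static class RecordParser
{
    public static IRecord Parse(string line)
    {
        // "1456057920 PERSON, Ted Danson, ..." -> header "1456057920 PERSON" + data fields.
        string[] parts = line.Split(',').Select(p => p.Trim()).ToArray();
        string[] header = parts[0].Split(new[] { ' ' }, 2);
        if (header.Length != 2)
            throw new FormatException($"Missing record type in '{line}'");

        // Assumption: the first token is Unix time in seconds.
        DateTimeOffset timestamp = DateTimeOffset.FromUnixTimeSeconds(long.Parse(header[0]));
        string[] fields = parts.Skip(1).ToArray();

        switch (header[1])
        {
            case "PERSON":
                return new PersonRecord { Timestamp = timestamp, Name = fields[0], Address = fields[1], Phone = fields[2] };
            case "THING":
                return new ThingRecord { Timestamp = timestamp, Name = fields[0], Description = fields[1] };
            default:
                throw new FormatException($"Unknown record type '{header[1]}'");
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string[] lines =
        {
            "1456057920 PERSON, Ted Danson, 123 Fake Street, 555-123-3214, blah",
            "1440581532 THING, Bucket, Has holes, Not a good bucket",
        };

        List<IRecord> records = lines.Select(RecordParser.Parse).ToList();

        // No second switch needed: each record knows how to handle itself.
        foreach (IRecord record in records)
            record.Process();
    }
}
```

The switch now lives in exactly one place (the parser), and everything downstream only talks to IRecord.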

After that you can do whatever you need. Some examples:

Loop through each item and call Process() to handle it.

Use LINQ's .OfType<{concrete type}>() to segment the list. (Warning: each call traverses the entire list, so with 10k records, segmenting by every concrete type means one full pass per type.)
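As a rough illustration of that trade-off, here is a sketch that shows OfType next to a single-pass alternative. The ToLookup variant is my own suggestion rather than part of the approach above, and the record classes are stripped-down, hypothetical stand-ins.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal hypothetical stand-ins for the real record types.
public interface IRecord { }
public sealed class PersonRecord : IRecord { public string Name { get; set; } = ""; }
public sealed class PlaceRecord : IRecord { public string City { get; set; } = ""; }

public static class Segmenting
{
    // OfType filters by type; every call walks the whole sequence once.
    public static List<PersonRecord> PeopleOnly(IEnumerable<IRecord> records) =>
        records.OfType<PersonRecord>().ToList();

    // Alternative: bucket all records by runtime type in a single pass.
    public static ILookup<Type, IRecord> ByType(IEnumerable<IRecord> records) =>
        records.ToLookup(r => r.GetType());
}
```

With the lookup, `byType[typeof(PlaceRecord)]` returns the places without another pass over the list, although the items come back typed as IRecord and need a cast.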

Use an overridden ToString method to give a single text representation of the IRecord

If using WPF, you can define a DataTemplate for each concrete type and bind an ItemsControl derivative to a collection of IRecord objects. Your "detail" display (e.g. a list item or a separate ContentControl) will then automagically display each item using the correct DataTemplate.

Continuing from my comment: well, that depends. What you described is actually pretty good for starters. You can of course expand it into a series of factories, one for each object type, so that you move from an explicit switch to searching for the first factory that can parse a line. That might prove useful if you expect to add more object types in the future: you just add another factory for the new kind of object. It's up to you whether these objects should share a common interface. An interface is generally used to define a behavior, so it doesn't seem needed here. Maybe you should just use a Dictionary instead? Ask yourself whether you actually need strongly typed objects here. Maybe what you need is a simple class with an ObjectType property and a Dictionary of properties, plus some helper methods for easy typed property access like GetBool, GetInt or a generic Get.
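A minimal sketch of the factory idea might look like the following. Everything named here (IRecordFactory, CanParse, Create, LineParser, and the Person and Place classes with their properties) is hypothetical and only meant to show the shape; the factories return plain object because a shared interface is optional in this approach.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record types; they deliberately share no interface.
public sealed class Person
{
    public long Timestamp { get; set; }
    public string Name { get; set; } = "";
    public string Address { get; set; } = "";
    public string Phone { get; set; } = "";
}

public sealed class Place
{
    public long Timestamp { get; set; }
    public string City { get; set; } = "";
    public string State { get; set; } = "";
}

// One factory per record type; adding a type means adding a factory.
public interface IRecordFactory
{
    bool CanParse(string recordType);
    object Create(long timestamp, string[] fields);
}

public sealed class PersonFactory : IRecordFactory
{
    public bool CanParse(string recordType) => recordType == "PERSON";
    public object Create(long timestamp, string[] fields) =>
        new Person { Timestamp = timestamp, Name = fields[0], Address = fields[1], Phone = fields[2] };
}

public sealed class PlaceFactory : IRecordFactory
{
    public bool CanParse(string recordType) => recordType == "PLACE";
    public object Create(long timestamp, string[] fields) =>
        new Place { Timestamp = timestamp, City = fields[0], State = fields[1] };
}

public sealed class LineParser
{
    private readonly IRecordFactory[] _factories;

    public LineParser(params IRecordFactory[] factories) { _factories = factories; }

    public object Parse(string line)
    {
        // Naive split: does not handle quoted fields containing commas.
        string[] parts = line.Split(',').Select(p => p.Trim()).ToArray();
        string[] header = parts[0].Split(new[] { ' ' }, 2);
        if (header.Length != 2)
            throw new FormatException($"Missing record type in '{line}'");

        // Replaces the switch: pick the first factory that accepts this type.
        var factory = _factories.FirstOrDefault(f => f.CanParse(header[1]));
        if (factory == null)
            throw new FormatException($"No factory for record type '{header[1]}'");

        return factory.Create(long.Parse(header[0]), parts.Skip(1).ToArray());
    }
}
```

Usage would be something like `new LineParser(new PersonFactory(), new PlaceFactory()).Parse(line)`. The caller still has to inspect the returned object's type, which is the price of skipping the common interface.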

