Need Help With Application Design

Question

So, I'd love some feedback on the best way to design the classes and store the data for the following situation:

I have an interface called ITask that looks like this:

interface ITask
{
    int ID { get; set; }
    string Title { get; set; }
    string Description { get; set; }
}

I would like the ability to create different types of Tasks depending on who is using the application...for example:

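public class SoftwareTask : ITask
{
    // ITask implementation
    public int ID { get; set; }
    public string Title { get; set; }
    public string Description { get; set; }

    // SoftwareTask-specific properties
    public string BuildVersion { get; set; }
    public bool IsBug { get; set; }
}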

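public class SalesTask : ITask
{
    // ITask implementation
    public int ID { get; set; }
    public string Title { get; set; }
    public string Description { get; set; }

    // SalesTask-specific properties
    public int AccountID { get; set; }
    public int SalesPersonID { get; set; }
}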

So the way I see it, I can create a Tasks table in the database with columns that match the ITask interface, plus a single column that holds all of the properties of the more specific task types (or maybe even serialize the whole task object into that column).

OR

Create a table for each task type to store the properties that are unique to that type.

I really don't like either solution right now. I need to be able to create different types of Tasks (or any other class) that all share a common core set of properties and methods through a base interface, but that can store their unique properties in a way that is easy to search and filter against, without having to create a database table for each type.

I've started looking into plug-in architecture and the strategy pattern, but I don't see how either would address my problem with storing and accessing the data.

Any help or push in the right direction is greatly appreciated!!!

Answer 1

You should probably take a lead from how ORMs deal with this: TPH (Table per Hierarchy), TPC (Table per Concrete Type), or TPT (Table per Type).

Given that ITask is an interface, you should probably go for TPC (Table per Concrete Type). If you make it a base class instead, TPT and TPH are also options.
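
To make the three strategies concrete, here is a rough sketch, not tied to any particular ORM, that prints which tables and columns each strategy would produce for the question's types. The `Layouts` helper, the `Discriminator` column name, and the table names are illustrative assumptions; a real ORM does considerably more than this.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Print the tables each strategy would produce for the question's two task types.
var types = new[] { typeof(SoftwareTask), typeof(SalesTask) };
foreach (var strategy in new[] { "TPH", "TPT", "TPC" })
{
    Console.WriteLine(strategy);
    foreach (var (table, columns) in Layouts.For(strategy, types))
        Console.WriteLine($"  {table}: {string.Join(", ", columns)}");
}

public interface ITask
{
    int ID { get; set; }
    string Title { get; set; }
    string Description { get; set; }
}

public class SoftwareTask : ITask
{
    public int ID { get; set; }
    public string Title { get; set; } = "";
    public string Description { get; set; } = "";
    public string BuildVersion { get; set; } = "";
    public bool IsBug { get; set; }
}

public class SalesTask : ITask
{
    public int ID { get; set; }
    public string Title { get; set; } = "";
    public string Description { get; set; } = "";
    public int AccountID { get; set; }
    public int SalesPersonID { get; set; }
}

// Hypothetical helper: derives table layouts by reflection.
public static class Layouts
{
    public static Dictionary<string, List<string>> For(string strategy, Type[] types)
    {
        var shared = typeof(ITask).GetProperties().Select(p => p.Name).ToList();
        // Properties a concrete type adds on top of ITask.
        List<string> Own(Type t) =>
            t.GetProperties().Select(p => p.Name).Except(shared).ToList();

        var tables = new Dictionary<string, List<string>>();
        switch (strategy)
        {
            case "TPH": // one table with every column, plus a type discriminator
                tables["Tasks"] = shared.Concat(new[] { "Discriminator" })
                                        .Concat(types.SelectMany(t => Own(t))).ToList();
                break;
            case "TPT": // shared table plus one narrow table per type, joined on ID
                tables["Tasks"] = shared;
                foreach (var t in types)
                    tables[t.Name] = new[] { "ID" }.Concat(Own(t)).ToList();
                break;
            case "TPC": // one full-width table per concrete type, no shared table
                foreach (var t in types)
                    tables[t.Name] = shared.Concat(Own(t)).ToList();
                break;
            default:
                throw new ArgumentException($"Unknown strategy: {strategy}");
        }
        return tables;
    }
}
```

Note that only TPC works without a shared table, which is why it is the natural fit when the common part is an interface rather than a mapped base class.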

Answer 2

Your second approach (one table per type) is the canonical way to solve this problem. It requires a bit more effort to implement, but it fits the relational model of most databases better and preserves a consistent and cohesive representation of the data. Using one table per concrete type works well and is compatible with most ORM libraries (such as Entity Framework and NHibernate).

There are, however, a couple of alternative approaches sometimes used when the number of subtypes is very large, or subtypes are created on the fly.

Alternative #1: The Key-Value extension table. This is a table with one row per additional field of data you wish to store, a foreign key back to the core table (Task), and a column that specifies what kind of field this is. Its structure is typically something like:

TaskExt Table
=================
TaskID     : Number (foreign key back to Task)
FieldType  : Number or String (this would be AccountID, SalesPersonID, etc)
FieldValue : String  (this would be the value of the associated field)
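
As a rough illustration of that layout, the sketch below flattens a task's type-specific fields into TaskExt-style rows and reads them back. The `TaskExtRow` record and the `KeyValueStore` helper are hypothetical names, and an in-memory list stands in for the database table.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// An in-memory list stands in for the TaskExt table.
var table = new List<TaskExtRow>();
KeyValueStore.Save(table, 1, new Dictionary<string, object>
{
    ["BuildVersion"] = "2.1.0",
    ["IsBug"] = true,
});
KeyValueStore.Save(table, 2, new Dictionary<string, object>
{
    ["AccountID"] = 42,
    ["SalesPersonID"] = 7,
});

foreach (var row in table)
    Console.WriteLine($"{row.TaskID} | {row.FieldType} | {row.FieldValue}");

// Reading one task back means collecting all of its rows again.
var fields = KeyValueStore.Load(table, 2);
Console.WriteLine(int.Parse(fields["AccountID"], CultureInfo.InvariantCulture));

// One row per extra field, mirroring the TaskExt columns.
public record TaskExtRow(int TaskID, string FieldType, string FieldValue);

// Hypothetical helper, not part of any ORM.
public static class KeyValueStore
{
    // Every value is stored as a string, so the caller must restore its type on load.
    public static void Save(List<TaskExtRow> table, int taskId, IDictionary<string, object> fields)
    {
        foreach (var (name, value) in fields)
            table.Add(new TaskExtRow(taskId, name,
                Convert.ToString(value, CultureInfo.InvariantCulture) ?? ""));
    }

    public static Dictionary<string, string> Load(IEnumerable<TaskExtRow> table, int taskId) =>
        table.Where(r => r.TaskID == taskId)
             .ToDictionary(r => r.FieldType, r => r.FieldValue);
}
```

Adding a new kind of field needs no schema change here, only a new FieldType value, which is the flexibility this alternative buys.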

Alternative #2: The Type-Mapped Extension Table. In this alternative, you create a table with a bunch of nullable columns of different data types (numbers, strings, date/time, etc.) with names like DATA01, DATA02, DATA03 ... and so on. For each kind of Task, you select a subset of the columns and map them to particular fields. So, DATA01 may end up being the BuildVersion for a SoftwareTask and an AccountName for a SalesTask. In this approach, you must manage some metadata somewhere that controls which column each field maps to. A type-mapped table will often look something like:

TaskExt Table
=================
TaskID   : Number  (foreign key back to task)
Data01   : String
Data02   : String
Data03   : String
Data04   : String
Data05   : Number
Data06   : Number
Data07   : Number
Data08   : Number
Data09   : Date
Data10   : Date
Data11   : Date
Data12   : Date
// etc...
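
The metadata this approach relies on can be as simple as a lookup from (task type, field name) to a physical column. A minimal sketch, where the `ColumnMap` class and the specific column assignments are assumptions chosen for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var map = new ColumnMap();
// The same physical column means different things for different task types.
map.Register("SoftwareTask", "BuildVersion", "Data01");
map.Register("SoftwareTask", "IsBug", "Data05");   // assumed to be stored as 0/1 in a Number column
map.Register("SalesTask", "AccountName", "Data01");
map.Register("SalesTask", "AccountID", "Data05");

Console.WriteLine(map.ColumnFor("SoftwareTask", "BuildVersion"));
Console.WriteLine(map.ColumnFor("SalesTask", "AccountName"));

// Hypothetical metadata dictionary for the type-mapped TaskExt table.
public class ColumnMap
{
    private readonly Dictionary<(string TaskType, string Field), string> _columns = new();

    public void Register(string taskType, string field, string column)
    {
        // Within one task type, a column can back only one field.
        bool taken = _columns.Any(kv =>
            kv.Key.TaskType == taskType && kv.Key.Field != field && kv.Value == column);
        if (taken)
            throw new InvalidOperationException(
                $"{column} is already mapped to another field of {taskType}");
        _columns[(taskType, field)] = column;
    }

    public string ColumnFor(string taskType, string field) =>
        _columns.TryGetValue((taskType, field), out var column)
            ? column
            : throw new KeyNotFoundException($"No column mapped for {taskType}.{field}");
}
```

Every query and every report has to go through a lookup like this, because the column names alone say nothing about what they hold.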

The main benefit of option #1 is that you can dynamically add as many different fields as you need, and you can even support a level of backward compatibility. A significant downside, however, is that even simple queries can become challenging because the fields of each object are spread across multiple rows in the table. Reassembling those rows into a single row per object turns out to be both complicated and often slow.
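
To see why queries get awkward, consider finding every task that is a bug in a given build. With key-value rows, each condition matches a different row, so the rows have to be regrouped per task before the conditions can be checked together. A sketch over in-memory rows, with the `TaskExtRow` record, the `KeyValueQuery` helper, and the sample data all being illustrative assumptions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var rows = new List<TaskExtRow>
{
    new(1, "BuildVersion", "2.1.0"), new(1, "IsBug", "True"),
    new(2, "BuildVersion", "2.1.0"), new(2, "IsBug", "False"),
    new(3, "AccountID", "42"),       new(3, "SalesPersonID", "7"),
};

var bugsInBuild = KeyValueQuery.FindTasks(rows, new Dictionary<string, string>
{
    ["BuildVersion"] = "2.1.0",
    ["IsBug"] = "True",
});
Console.WriteLine(string.Join(", ", bugsInBuild));

public record TaskExtRow(int TaskID, string FieldType, string FieldValue);

public static class KeyValueQuery
{
    // Returns the IDs of tasks whose rows satisfy every (field, value) condition.
    public static List<int> FindTasks(
        IEnumerable<TaskExtRow> rows, IReadOnlyDictionary<string, string> conditions) =>
        rows.GroupBy(r => r.TaskID)   // regroup the rows into one group per task
            .Where(g => conditions.All(c =>
                g.Any(r => r.FieldType == c.Key && r.FieldValue == c.Value)))
            .Select(g => g.Key)
            .OrderBy(id => id)
            .ToList();
}
```

Only task 1 matches both conditions here. In SQL, the equivalent query typically needs one self-join per condition or a GROUP BY with a HAVING clause, compared with a plain two-column WHERE clause on a table that has real BuildVersion and IsBug columns.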

The benefit of option #2 is that it's easy to implement and preserves a 1-to-1 correspondence between Task rows and extension rows, making queries easy. Unfortunately, there are some downsides to this as well. The first is that the column names are completely uninformative, and you have to refer to some metadata dictionary to understand which column maps to which field for which type of task. The second downside is that databases limit the number of columns on a table (an ordinary SQL Server table, for example, allows at most 1,024), and practical limits on row size can be reached sooner. As a result, you can only have so many numeric, string, datetime, etc. columns available to use. So if your type ends up having more DateTime fields than the table supports, you have to either use string fields to store dates or create multiple extension tables.

Be forewarned: most ORM libraries do not provide built-in support for either of these modeling patterns.

2 answers

solution1
3 2010-12-29 21:28:31

solution2
3 ACCEPTED 2010-12-29 21:31:13
