Class linking best practices in C#

Question

First off, EF is not an option for our development environment so please no "just use EF" answers ...

I think this is a pretty standard dilemma so I'm sure there must be a way that most Pros do it that I just have not stumbled across ... so I'm out here hoping y'all can show me what it is.

Let's say you have the following database tables:

tblCompanies 
ID 
NAME

tblDepartments 
ID 
COMPANY_ID 
NAME

tblEmployees
ID 
DEPARTMENT_ID 
FIRSTNAME 
LASTNAME

... what's the best way to represent this in Classes within your code?

I assume the best way is like this:

public class Company
{
     public int ID { get; set; }
     public string Name { get; set; }
     public List<Department> Departments { get; set; }
}

public class Department
{
     public int ID { get; set; }
     public string Name { get; set; }
     public List<Employee> Employees { get; set; }
}

public class Employee
{
     public int ID { get; set; }
     public string FirstName { get; set; }
     public string LastName { get; set; }
}

I believe that to be the "proper OOP" approach. However, what always seems to happen is something like this:

public class Department
{
     public int ID { get; set; }
     public string Name { get; set; }
     public int CompanyID { get; set; }
     public List<Employee> Employees { get; set; }
}

... mainly because when you pull just a Department from the database, you only have the company ID, not all the other attributes needed to fully populate an instance of the Company class.

(I've used a pretty vanilla example here, but the one I'm actually tackling in my current project uses 3 fields to link the data together, so the thought of having the same 3 fields in several classes seems wrong to me.)

Is there a Best Practice for these scenarios? As much as I don't like the thought of storing the same data in multiple classes just out of laziness, I also don't like returning an instance of a class with just one of its fields populated because that's all I had at the time.

Answer 1

This is a common problem, and one that ORMs try to solve. It isn't an easy one, and the right answer depends on what you want and what your constraints are.

There are only two fundamental ways to keep a single copy of the information: lazily load the data as it is requested, or load it all up front (greedy loading, more commonly called eager loading). Otherwise you have to duplicate the data.

With lazy loading, you set things up so that navigating into a property triggers a call to the database that loads the entity behind that property. The tricky part to watch for is the SELECT N+1 problem: when you iterate a set of parent entities and trigger a lazy load on each one, you end up making N+1 calls to the database, one to load the set of parents (1) and one per parent to load its children (N).
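
As a minimal sketch of how the N+1 pattern arises, the code below uses Lazy<T> and a hypothetical in-memory FakeDb that only counts how many "queries" it receives. FakeDb, LazyDepartment and NPlusOneDemo are made-up names for illustration, not part of any library, and the sample data is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the database: it just counts calls.
public class FakeDb
{
    private readonly Dictionary<int, string[]> _employeesByDepartment =
        new Dictionary<int, string[]>
        {
            { 10, new[] { "Adams", "Baker" } },
            { 20, new[] { "Clark" } },
            { 30, new[] { "Davis" } }
        };

    public int QueryCount { get; private set; }

    // Stands in for: SELECT ID FROM tblDepartments
    public List<int> GetDepartmentIds()
    {
        QueryCount++;
        return _employeesByDepartment.Keys.ToList();
    }

    // Stands in for: SELECT ... FROM tblEmployees WHERE DEPARTMENT_ID = @id
    public List<string> GetEmployeeNames(int departmentId)
    {
        QueryCount++;
        string[] names;
        return _employeesByDepartment.TryGetValue(departmentId, out names)
            ? names.ToList()
            : new List<string>();
    }
}

public class LazyDepartment
{
    private readonly Lazy<List<string>> _employees;

    public LazyDepartment(int id, FakeDb db)
    {
        ID = id;
        // Nothing is loaded here; the query runs on first access to Employees.
        _employees = new Lazy<List<string>>(() => db.GetEmployeeNames(id));
    }

    public int ID { get; private set; }
    public List<string> Employees { get { return _employees.Value; } }
}

public static class NPlusOneDemo
{
    // Loads the parents with one query, then touches every child collection.
    public static int CountQueries()
    {
        var db = new FakeDb();
        var departments = db.GetDepartmentIds()
                            .Select(id => new LazyDepartment(id, db))
                            .ToList();

        int totalEmployees = 0;
        foreach (var department in departments)
        {
            totalEmployees += department.Employees.Count; // one query per department
        }
        return db.QueryCount;
    }
}
```

With the three departments in this sketch, NPlusOneDemo.CountQueries() makes 1 + 3 = 4 calls: one for the department list and one per department for its employees.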

Greedy loading means loading everything you need up front. ORMs (where they work) are nice because they take care of many of the details via LINQ, usually produce solutions that are both performant and maintainable, and let you control where greedy and lazy loading are used.
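
Without an ORM, greedy loading can be approximated by running one query per table and stitching the results together in memory. The sketch below assumes that has already happened: DepartmentRow and EmployeeRow are hypothetical shapes for the query results, and EagerLoader is a made-up helper, not an existing API.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shapes: what one query per table would return.
public class DepartmentRow
{
    public int ID { get; set; }
    public int CompanyID { get; set; }
    public string Name { get; set; }
}

public class EmployeeRow
{
    public int ID { get; set; }
    public int DepartmentID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Employee
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Department
{
    public int ID { get; set; }
    public string Name { get; set; }
    public List<Employee> Employees { get; set; }
}

public static class EagerLoader
{
    // Builds fully populated departments from two result sets,
    // so the database is hit twice no matter how many departments exist.
    public static List<Department> Build(
        IEnumerable<DepartmentRow> departmentRows,
        IEnumerable<EmployeeRow> employeeRows)
    {
        // Group employees by their foreign key once, up front.
        ILookup<int, EmployeeRow> byDepartment =
            employeeRows.ToLookup(e => e.DepartmentID);

        return departmentRows.Select(d => new Department
        {
            ID = d.ID,
            Name = d.Name,
            // A lookup returns an empty sequence for keys it does not contain.
            Employees = byDepartment[d.ID]
                .Select(e => new Employee
                {
                    ID = e.ID,
                    FirstName = e.FirstName,
                    LastName = e.LastName
                })
                .ToList()
        }).ToList();
    }
}
```

The trade-off is the usual one: two queries instead of N+1, at the cost of loading employees you may never look at.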

Another important gotcha is many-to-many relationships. You need to avoid circular initialization and all the baggage that comes with circular dependencies. There are surely many more gotchas that I have missed.

In my humble opinion, there isn't so much a best practice as a set of practices, some of them bad; nothing is perfect. You can:

Start rolling your own object relational mapper allowing you to get rid of the duplicate ID
Use a lighter ORM framework to handle some of this allowing you to get rid of the duplicate ID
Create specialized queries to load aggregations of data allowing you to get rid of the duplicate ID (* cough * DDD)
Just keep the duplication of the ID like you mention above and not worry about creating an explicit relational model in your domain.

This one is on you: choose what is best based on your constraints. This is a deep topic and my experience is limited, so take what I am saying with a lot of salt.

Answer 2

I don't think there's a "best practices" manual for this kind of thing, and surely it depends on how your classes are going to be used. But in my personal experience, I have ended up following this approach:

using System;
using System.Collections.Generic;

public class Company
{
   public int ID { get; set; }
   public string Name { get; set; }

   public IEnumerable<Department> GetDepartments()
   {
      // Get departments here (query by this company's ID)
      throw new NotImplementedException();
   }
}

public class Department
{
   public int ID { get; set; }
   public string Name { get; set; }
   protected int CompanyID { get; set; }

   private Company _Company;
   public Company Company
   {
      get
      {
         // Get company here (look it up by CompanyID; the result can be cached in _Company)
         throw new NotImplementedException();
      }
   }

   public IEnumerable<Employee> GetEmployees()
   {
      // Get employees here (query by this department's ID)
      throw new NotImplementedException();
   }
}

public class Employee
{
   public int ID { get; set; }
   public string Name { get; set; }
   protected int DepartmentID { get; set; }

   private Department _Department;
   public Department Department
   {
      get
      {
         // Get department here (look it up by DepartmentID; the result can be cached in _Department)
         throw new NotImplementedException();
      }
   }
}

In some cases I have exposed some of the "navigation" properties of my classes as public (like CompanyID and DepartmentID) to avoid instantiating a new object just to read a value that has already been loaded.

As others have noted, you could also simulate "lazy loading", but this will require some extra effort on your part.

Answer 3

I would think it depends on requirements. Do you need to traverse upward (get the company from a department, the department from an employee, etc.)? If you do, then it is best to provide a means of doing that. Ideally that would be something like a Company or Department property. Of course, you wouldn't want to fetch data you don't really need, so you'd likely keep a private company ID and have a public getCompany function that queries for the data.
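
One way this could look is sketched below. The Func<int, Company> passed to the constructor is an assumption made for illustration: it stands in for whatever data-access call you already have, and the method is named GetCompany to follow C# naming conventions.

```csharp
using System;

public class Company
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public class Department
{
    private readonly int _companyId;                   // foreign key stays private
    private readonly Func<int, Company> _loadCompany;  // hypothetical data-access hook
    private Company _company;                          // cached after the first lookup

    public Department(int id, string name, int companyId, Func<int, Company> loadCompany)
    {
        if (loadCompany == null) throw new ArgumentNullException("loadCompany");
        ID = id;
        Name = name;
        _companyId = companyId;
        _loadCompany = loadCompany;
    }

    public int ID { get; private set; }
    public string Name { get; private set; }

    // Queries for the company only when somebody actually asks for it.
    public Company GetCompany()
    {
        if (_company == null)
        {
            _company = _loadCompany(_companyId);
        }
        return _company;
    }
}
```

Callers that never traverse upward never pay for the company query, and callers that do pay for it once per Department instance.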

Answer 4

I don't believe this is really an OOP question. In your case you just have a database model (a representation of the database in classes) that does not contain any logic, so all the classes are used as structs, and that is the right way to map your database to classes. In the next module, which will represent the logic of your program, you map your database module to the real classes that contain the logic (I mean the methods that implement it), if you really need them, of course. So in my opinion the OO question belongs in the logic part of your application. On the other hand, you could take a look at NHibernate and how mapping is done there; it will give you a hint for the best database model implementation.
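
A rough sketch of that split, under the assumption that the logic layer wraps the data-only classes: EmployeeRecord mirrors tblEmployees and carries no behavior, while Employee is a hypothetical logic-layer class. FullName is just an invented example of "logic".

```csharp
// Data-only class: a struct-like mirror of tblEmployees, no behavior.
public class EmployeeRecord
{
    public int ID { get; set; }
    public int DepartmentID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Logic-layer class built from a record; this is where methods would live.
public class Employee
{
    private readonly EmployeeRecord _record;

    public Employee(EmployeeRecord record)
    {
        _record = record;
    }

    public int ID { get { return _record.ID; } }

    // Example of behavior that belongs in the logic layer, not the data model.
    public string FullName
    {
        get { return _record.FirstName + " " + _record.LastName; }
    }
}
```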

Answer 5

I believe this is what your classes would look like in NHibernate:

public class Company
{
     public int ID { get; set; }
     public string Name { get; set; }
     public IList<Department> Departments { get; set; }
}

public class Department
{
     public int ID { get; set; }
     public string Name { get; set; }
     public Company Company { get; set; }
     public IList<Employee> Employees { get; set; }
}

public class Employee
{
     public int ID { get; set; }
     public string FirstName { get; set; }
     public string LastName { get; set; }
     public Department Department { get; set; }
}

Note that there is a way to navigate from Employee to Department and from Department to Company (in addition to what you already specified).

NHibernate has all kinds of features to make that just work, and it works very, very well. The main trick is run-time proxy objects, which allow for lazy loading. NHibernate also supports many different ways to eager load and lazy load, so you can do it exactly how you want.

Sure, you can get these same features without NHibernate or a similar ORM, but why wouldn't you just use a feature-rich mainstream technology instead of hand-coding your own feature-poor custom ORM?

Answer 6

There is another option. Create a 'DataController' class which handles the loading and 'memoization' of your objects. The DataController maintains a dictionary of [company ID, Company object] pairs and another of [department ID, Department object] pairs. When you load a new Department or Company, you record it in the corresponding dictionary.

Then, when you instantiate a new Department or Employee, you can either set the reference to the parent object directly, or use a Lazy<Company> / Lazy<Department> object and set it with a lambda (in the constructor). The lambda captures the DataController, so the objects themselves never reference it directly.

You can also place logic in the getter / get method for the dictionaries that queries the database if a particular ID is not found. Using all of this together allows your classes (models) to be very clean while still being fairly flexible as to when and how their data is loaded.
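
A minimal sketch of this idea for the Company/Department pair follows. The queryCompany delegate is an assumption standing in for the real database call, and CreateDepartment is a made-up factory method; only the dictionary-plus-Lazy<T> structure comes from the description above.

```csharp
using System;
using System.Collections.Generic;

public class Company
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public class Department
{
    private readonly Lazy<Company> _company;

    // The model only sees a factory delegate, never the DataController itself.
    public Department(int id, string name, Func<Company> companyFactory)
    {
        ID = id;
        Name = name;
        _company = new Lazy<Company>(companyFactory);
    }

    public int ID { get; private set; }
    public string Name { get; private set; }
    public Company Company { get { return _company.Value; } }
}

public class DataController
{
    private readonly Dictionary<int, Company> _companies = new Dictionary<int, Company>();
    private readonly Func<int, Company> _queryCompany; // hypothetical database call

    public DataController(Func<int, Company> queryCompany)
    {
        _queryCompany = queryCompany;
    }

    // Returns the memoized Company, querying the database only on a miss.
    public Company GetCompany(int id)
    {
        Company company;
        if (!_companies.TryGetValue(id, out company))
        {
            company = _queryCompany(id);
            _companies[id] = company;
        }
        return company;
    }

    // The lambda captures this controller, keeping the model class clean.
    public Department CreateDepartment(int id, string name, int companyId)
    {
        return new Department(id, name, () => GetCompany(companyId));
    }
}
```

Two departments created for the same company ID end up sharing one Company instance, and the database is asked for it at most once.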

6 answers

solution1
5 2012-02-28 20:36:12

solution2
3 ACCEPTED 2012-02-28 20:49:55

solution3
2 2012-02-28 20:39:55

solution4
1 2012-02-28 20:45:53

solution5
0 2012-02-29 11:29:36

solution6
0 2012-03-04 16:20:49
