
How can an object-oriented programmer get his/her head around database-driven programming?

I have been programming in C# and Java for a little over a year and have a decent grasp of object-oriented programming, but my new side project requires a database-driven model. I'm using C# and LINQ, which seems to be a very powerful tool, but I'm having trouble designing a database around my object-oriented approach.

My two main questions are:

How do I deal with inheritance in my database? Let's say I'm building a staff rostering application and I have an abstract class, Event. From Event I derive abstract classes ShiftEvent and StaffEvent. I then have concrete classes Shift (derived from ShiftEvent) and StaffTimeOff (derived from StaffEvent). There are other derived classes, but for the sake of argument these are enough.

Should I have a separate table for ShiftEvents and StaffEvents? Maybe I should have separate tables for each concrete class? Both of these approaches seem like they would give me problems when interacting with the database. Another approach could be to have one Event table with nullable columns for every type of data in any of my concrete classes. All of these approaches feel like they could impede extensibility down the road. More than likely there is another approach that I have not considered.

My second question:

How do I deal with collections and one-to-many relationships in an object-oriented way?

Let's say I have a Products class and a Categories class. Each instance of Categories would contain one or more products, but the products themselves should have no knowledge of categories. If I want to implement this in a database, then each product would need a category ID which maps to the categories table. But this introduces more coupling than I would prefer from an OO point of view. The products shouldn't even know that the categories exist, much less have a data field containing a category ID! Is there a better way?

Linq to SQL using a table per class solution:

http://blogs.microsoft.co.il/blogs/bursteg/archive/2007/10/01/linq-to-sql-inheritance.aspx

Other solutions (such as my favorite, LLBLGen) allow other models. Personally, I like the single-table solution with a discriminator column, but that is probably because we often query across the inheritance hierarchy and thus see that as the normal query, whereas querying a specific type only requires a change to the "where" clause.
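To make the discriminator idea concrete, here is a minimal in-memory sketch in Java. It is not how LLBLGen or Linq to SQL implement it; the row shape, the column names and the "SHIFT" / "TIME_OFF" discriminator values are all assumptions made up for illustration, and the intermediate abstract classes from the question are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shape of one row in a single Event table. Columns that belong
// to only one subclass are null for rows of the other type.
final class EventRow {
    final int id;
    final String type;          // discriminator column (assumed values: SHIFT, TIME_OFF)
    final String title;
    final String shiftLocation; // used by SHIFT rows only
    final String timeOffReason; // used by TIME_OFF rows only

    EventRow(int id, String type, String title, String shiftLocation, String timeOffReason) {
        this.id = id;
        this.type = type;
        this.title = title;
        this.shiftLocation = shiftLocation;
        this.timeOffReason = timeOffReason;
    }
}

abstract class Event {
    final int id;
    final String title;

    Event(int id, String title) {
        this.id = id;
        this.title = title;
    }
}

final class Shift extends Event {
    final String location;

    Shift(int id, String title, String location) {
        super(id, title);
        this.location = location;
    }
}

final class StaffTimeOff extends Event {
    final String reason;

    StaffTimeOff(int id, String title, String reason) {
        super(id, title);
        this.reason = reason;
    }
}

final class EventMapper {
    // Choose the concrete class from the discriminator value.
    static Event fromRow(EventRow row) {
        switch (row.type) {
            case "SHIFT":
                return new Shift(row.id, row.title, row.shiftLocation);
            case "TIME_OFF":
                return new StaffTimeOff(row.id, row.title, row.timeOffReason);
            default:
                throw new IllegalArgumentException("Unknown event type: " + row.type);
        }
    }

    // Querying across the hierarchy: every row maps to some Event.
    static List<Event> all(List<EventRow> rows) {
        List<Event> events = new ArrayList<>();
        for (EventRow row : rows) {
            events.add(fromRow(row));
        }
        return events;
    }

    // Querying one type is the same query plus a filter on the discriminator.
    static List<Event> ofType(List<EventRow> rows, String type) {
        List<Event> events = new ArrayList<>();
        for (EventRow row : rows) {
            if (row.type.equals(type)) {
                events.add(fromRow(row));
            }
        }
        return events;
    }
}
```

In a real application the filter in ofType would be the "where" clause on the discriminator column, and an ORM would normally generate the equivalent of fromRow.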

All said and done, I personally feel that mapping OO into tables is putting the cart before the horse. There have been continual claims that the impedance mismatch between OO and relations has been solved... and there have been plenty of OO-specific databases. None of them have unseated the powerful simplicity of the relation.

Instead, I tend to design the database with the application in mind, map those tables to entities and build from there. Some see this as a loss of OO in the design process, but in my mind the data layer shouldn't reach high enough into your application to affect the design of the higher-order systems just because you used a relational model for storage.

I had the opposite problem: how to get my head around OO after years of database design. Come to that, a decade earlier I had the problem of getting my head around SQL after years of "structured" flat-file programming. There are just enough similarities between class and data entity decomposition to mislead you into thinking that they're equivalent. They aren't.

I tend to agree with the view that once you're committed to a relational database for storage then you should design a normalised model and compromise your object model where unavoidable. This is because you're more constrained by the DBMS than you are by your own code - building a compromised data model is more likely to cause you pain.

That said, in the examples given, you have choices: if ShiftEvent and StaffEvent are mostly similar in terms of attributes and are often processed together as Events, then I'd be inclined to implement a single Events table with a type column. Single-table views can be an effective way to separate out the sub-classes and on most db platforms are updatable. If the classes differ more in terms of attributes, then a table for each might be more appropriate. I don't think I like the three-table idea: "has one or none" relationships are seldom necessary in relational design. Anyway, you can always create an Event view as the union of the two tables.

As to Product and Category, if one Category can have many Products, but not vice versa, then the normal relational way to represent this is for the product to contain a category id. Yes, it's coupling, but it's only data coupling, and it's not a mortal sin. The column should probably be indexed, so that it's efficient to retrieve all products for a category. If you're really horrified by the notion then pretend it's a many-to-many relationship and use a separate ProductCategorisation table. It's not that big a deal, although it implies a potential relationship that doesn't really exist and might mislead someone coming to the app in future.

In my opinion, these paradigms (the Relational Model and OOP) apply to different domains, making it difficult (and pointless) to try to create a mapping between them.

The Relational Model is about representing facts (such as "A is a person"), i.e. intangible things that have the property of being "unique". It doesn't make sense to talk about several "instances" of the same fact - there is just the fact.

Object Oriented Programming is a programming paradigm detailing a way to construct computer programs to fulfill certain criteria (re-use, polymorphism, information hiding...). An object is typically a metaphor for some tangible thing - a car, an engine, a manager or a person etc. Tangible things are not facts - there may be two distinct objects with identical state without them being the same object (hence the difference between equals and == in Java, for example).
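The equals versus == distinction can be shown in a few lines of Java. The Person class and the IdentityDemo helper below are hypothetical names invented for this sketch; the point is only that two objects can carry identical state and still be distinct objects.

```java
import java.util.Objects;

// Hypothetical class whose state is a single name.
final class Person {
    private final String name;

    Person(String name) {
        this.name = name;
    }

    // Two Person objects are equal when their state (the name) is equal.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Person)) {
            return false;
        }
        return name.equals(((Person) other).name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
}

final class IdentityDemo {
    // Compares state: the closest thing objects have to "the same fact".
    static boolean sameState(Person a, Person b) {
        return a.equals(b);
    }

    // Compares identity: true only if both references point at one object.
    static boolean sameObject(Person a, Person b) {
        return a == b;
    }
}
```

For two separately constructed Person objects with the same name, sameState is true while sameObject is false. A relational table has no counterpart to the second comparison: a row is identified by its values alone.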

Spring and similar tools provide access to relational data programmatically, so that the facts can be represented by objects in the program. This does not mean that OOP and the Relational Model are the same, or should be confused with each other. Use the Relational Model to design databases (collections of facts) and OOP to design computer programs.

TL;DR version (Object-Relational impedance mismatch distilled):

Facts = the recipe on your fridge. Objects = the content of your fridge.

I also came to understand database design, SQL, and particularly the data-centered world view before tackling the object-oriented approach. The object-relational impedance mismatch still baffles me.

The closest thing I've found to getting a handle on it is this: looking at objects not from an object-oriented programming perspective, or even from an object-oriented design perspective, but from an object-oriented analysis perspective. The best book on OOA that I got was written in the early 90s by Peter Coad.

On the database side, the best model to compare with OOA is not the relational model of data, but the Entity-Relationship (ER) model. An ER model is not really relational, and it doesn't specify the logical design. Many relational apologists think that is ER's weakness, but it is actually its strength. ER is best used not for database design but for requirements analysis of a database, otherwise known as data analysis.

ER data analysis and OOA are surprisingly compatible with each other. ER, in turn, is fairly compatible with relational data modeling and hence with SQL database design. OOA is, of course, compatible with OOD and hence with OOP.

This may seem like the long way around. But if you keep things abstract enough, you won't waste too much time on the analysis models, and you'll find it surprisingly easy to overcome the impedance mismatch.

The biggest thing to get over in terms of learning database design is this: data linkages like the foreign-key-to-primary-key linkage you objected to in your question are not horrible at all. They are the essence of tying related data together.

There is a phenomenon in pre-database and pre-object-oriented systems called the ripple effect. The ripple effect is where a seemingly trivial change to a large system ends up requiring consequent changes all over the system.

OOP contains the ripple effect primarily through encapsulation and information hiding.

Relational data modeling overcomes the ripple effect primarily through physical data independence and logical data independence.

On the surface, these two seem like fundamentally contradictory modes of thinking. Eventually, you'll learn how to use both of them to good advantage.

My guess off the top of my head:

On the topic of inheritance I would suggest having 3 tables: Event, ShiftEvent and StaffEvent. Event holds the common data elements, much like the abstract class as it was originally defined.

The last one can go the other way, I think. You could have a table with only a category ID and a product ID and no other columns. For a given category ID it returns the products, but the product does not need to carry the category as part of how it describes itself.

The big question: how can you get your head around it? It just takes practice. You try implementing a database design, run into problems with your design, refactor, and remember for next time what worked and what didn't.

To answer your specific questions... there is a little bit of opinion thrown in here, as in "how I would do it", not taking into account performance needs and such. I always start fully normalized and go from there based on real-world testing:

Table Event
EventID
Title
StartDateTime
EndDateTime

Table ShiftEvent
ShiftEventID
EventID
ShiftSpecificProperty1

...

Table Product
ProductID
Name

Table Category
CategoryID
Name

Table CategoryProduct
CategoryID
ProductID
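As a rough illustration of how the Event and ShiftEvent tables above come back together in code, the following Java sketch joins in-memory rows on EventID to build Shift objects. The row classes mirror the column lists above (dates are left out for brevity), and ShiftAssembler is a hypothetical helper; with a real database the join would be done in SQL or by an ORM.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One row of the Event table (StartDateTime/EndDateTime omitted for brevity).
final class BaseEventRow {
    final int eventId;
    final String title;

    BaseEventRow(int eventId, String title) {
        this.eventId = eventId;
        this.title = title;
    }
}

// One row of the ShiftEvent table; eventId refers to the Event table.
final class ShiftEventRow {
    final int shiftEventId;
    final int eventId;
    final String shiftSpecificProperty1;

    ShiftEventRow(int shiftEventId, int eventId, String shiftSpecificProperty1) {
        this.shiftEventId = shiftEventId;
        this.eventId = eventId;
        this.shiftSpecificProperty1 = shiftSpecificProperty1;
    }
}

// The object the application works with: common and shift-specific data merged.
final class ShiftRecord {
    final int eventId;
    final String title;
    final String shiftSpecificProperty1;

    ShiftRecord(int eventId, String title, String shiftSpecificProperty1) {
        this.eventId = eventId;
        this.title = title;
        this.shiftSpecificProperty1 = shiftSpecificProperty1;
    }
}

final class ShiftAssembler {
    // In-memory equivalent of an inner join of ShiftEvent to Event on EventID.
    static List<ShiftRecord> join(List<BaseEventRow> events, List<ShiftEventRow> shiftEvents) {
        Map<Integer, BaseEventRow> byId = new HashMap<>();
        for (BaseEventRow event : events) {
            byId.put(event.eventId, event);
        }
        List<ShiftRecord> result = new ArrayList<>();
        for (ShiftEventRow shift : shiftEvents) {
            BaseEventRow event = byId.get(shift.eventId);
            if (event != null) { // skip shift rows with no matching event row
                result.add(new ShiftRecord(event.eventId, event.title, shift.shiftSpecificProperty1));
            }
        }
        return result;
    }
}
```

Event rows that have no ShiftEvent row are simply not shifts; they would be picked up by the corresponding join for their own subtype table.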

Also reiterating what Pierre said - an ORM tool like Hibernate makes dealing with the friction between relational structures and OO structures much nicer.

There are several ways to map an inheritance tree to a relational model. NHibernate, for instance, supports the 'table per class hierarchy', 'table per subclass' and 'table per concrete class' strategies: http://www.hibernate.org/hib_docs/nhibernate/html/inheritance.html

For your second question: you can create a 1:n relation in your DB, where the Products table of course has a foreign key to the Categories table. However, this does not mean that your Product class needs to have a reference to the Category instance to which it belongs. You can create a Category class, which contains a set or list of products, and you can create a Product class, which has no notion of the Category to which it belongs. Again, you can easily do this using (N)Hibernate: http://www.hibernate.org/hib_docs/reference/en/html/collections.html
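A minimal Java sketch of that arrangement, without any ORM: the foreign key exists only in the row read from the database, and the domain Product never sees it. ProductRow, CatalogLoader and the field names are assumptions made up for this example; (N)Hibernate would do the equivalent work through its collection mapping.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Domain class: knows nothing about categories.
final class Product {
    final int id;
    final String name;

    Product(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Domain class: owns the collection of its products.
final class Category {
    final int id;
    final String name;
    private final List<Product> products = new ArrayList<>();

    Category(int id, String name) {
        this.id = id;
        this.name = name;
    }

    void add(Product product) {
        products.add(product);
    }

    List<Product> products() {
        return Collections.unmodifiableList(products);
    }
}

// Hypothetical row from the Products table, including the foreign key column.
final class ProductRow {
    final int productId;
    final String name;
    final int categoryId;

    ProductRow(int productId, String name, int categoryId) {
        this.productId = productId;
        this.name = name;
        this.categoryId = categoryId;
    }
}

final class CatalogLoader {
    // Uses the foreign key to fill the Category, then drops it: the Product
    // objects handed to the application carry no category information.
    static Category load(int categoryId, String categoryName, List<ProductRow> rows) {
        Category category = new Category(categoryId, categoryName);
        for (ProductRow row : rows) {
            if (row.categoryId == categoryId) {
                category.add(new Product(row.productId, row.name));
            }
        }
        return category;
    }
}
```

The coupling the question worries about is confined to the data layer (ProductRow and CatalogLoader here), which is the same separation the mapping file gives you with an ORM.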

Sounds like you are discovering the Object-Relational Impedance Mismatch.

The products shouldn't even know that the categories exist, much less have a data field containing a category ID!

I disagree here. I would think that instead of supplying a category ID yourself, you let your ORM do it for you. Then in code you would have something like this (borrowing from NHibernate's and Castle's ActiveRecord):

class Category
{
    [HasMany]
    public IList<Product> Products { get; set; }
}

...

class Product
{
    [BelongsTo]
    public Category ParentCategory { get; set; }
}

Then if you wanted to see what category a product is in, you'd just do something simple like:

Product.ParentCategory

I think you can set up ORMs differently, but either way, for the inheritance question, I ask... why do you care? Either go about it with objects and forget about the database, or do it a different way. It might seem silly, but unless you really can't have a bunch of tables, or don't want a single table for some reason, why would you care about the database? For instance, I have the same setup with a few inheriting objects, and I just go about my business. I haven't looked at the actual database yet as it doesn't concern me. What concerns me is the underlying SQL and the correct data coming back.

If you have to care about the database then you're going to need to either modify your objects or come up with a custom way of doing things.

I guess a bit of pragmatism would be good here. Mappings between objects and tables always have a bit of strangeness here and there. Here's what I do:

I use Ibatis to talk to my database (Java to Oracle). Whenever I have an inheritance structure where I want a subclass to be stored in the database, I use a "discriminator". This is a trick where you have one table for all the classes (types), with all the fields you could possibly want to store. There is one extra column in the table, containing a string which Ibatis uses to decide which type of object it needs to return.

It looks funny in the database, and it can sometimes get you into trouble with relations on fields that are not present in all classes, but 80% of the time this is a good solution.

Regarding your relation between category and product, I would add a categoryId column to the product, because that would make life really easy, both SQL-wise and mapping-wise. If you're really stuck on doing the "theoretically correct thing", you can consider an extra table which has only 2 columns, connecting the categories and their products. It will work, but generally this construction is only used when you need many-to-many relations.

Try to keep it as simple as possible. Having an "academic solution" is nice, but it generally means a bit of overkill and is harder to refactor because it is too abstract (like hiding the relations between Category and Product).

I hope this helps.
