
Implementing inheritance in MySQL: alternatives and a table with only surrogate keys

Asked 2013-09-05 22:18:03. Tags: mysql, database, database-design, scalability, database-performance

This question has probably been asked before, but I'm having trouble finding my exact case, so I'll explain my situation in the hope of getting some feedback:

I have an application that will register locations. There are several types of locations, and each type has a different set of attributes, but I need to associate notes with locations regardless of their type, and also attach other types of content (mostly multimedia entries and comments) to those notes. With this in mind, I came up with a few solutions:

1. Create a table for each location type, and a "notes" table with a foreign key for every location table. This is pretty troublesome because I would have to create a multimedia table and a comments table for every notes table, e.g.:
- LocationTypeA
  - ID
  - Attr1
  - Attr2
- LocationTypeA_Notes
  - ID
  - Attr1
  - ...
  - LocationTypeA_fk
- LocationTypeA_Notes_Multimedia
  - ID
  - Attr1
  - ...
  - LocationTypeA_Notes_fk
And so on. This would be quite annoying to set up, but once it's done, developing on this structure should not be so troublesome.
2. Create a table with a unique identifier for the location and point content there, like so:
- Location
  - ID
- LocationTypeA
  - ID
  - Attr1
  - Attr2
  - Location_fk
- Notes
  - ID
  - Attr1
  - ...
  - Location_fk
- Multimedia
  - ID
  - Attr1
  - ...
  - Notes_fk
As you can see, this is far simpler and also easier to develop against, but I just don't like the look of that table with only IDs (that's truly the only objection I have; to be honest, it's the option I like the most).
3. Similar to option 2, but I would have an enormous table of attributes shaped like this:
- Location
  - ID
  - Type
- Attribute
  - Name
  - Value
And so on, or a table for each attribute, à la Drupal. This would be a pain to develop because it would take several insert/update operations to do anything with a location, and the Attribute table would be several times bigger than the Location table (or I would end up with an enormous number of attribute tables). It also has the same issue as the surrogate-keys-only table (except that it now has a "type", which I would use to define the behavior of the location programmatically), but it's a pretty solution.
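
To make option 2 concrete, here is a minimal sketch of the shared Location table with type-specific tables and a single Notes table. It uses Python's built-in sqlite3 module only so that it runs without a server; the DDL is close to, but not the same as, what MySQL would need. LocationTypeB, the Body and Url columns, and the helper functions are hypothetical names added for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE Location (
    ID INTEGER PRIMARY KEY               -- surrogate key only, as in option 2
);
CREATE TABLE LocationTypeA (
    ID          INTEGER PRIMARY KEY,
    Attr1       TEXT,
    Attr2       TEXT,
    Location_fk INTEGER NOT NULL UNIQUE REFERENCES Location(ID)
);
CREATE TABLE LocationTypeB (             -- hypothetical second location type
    ID          INTEGER PRIMARY KEY,
    AttrX       TEXT,
    Location_fk INTEGER NOT NULL UNIQUE REFERENCES Location(ID)
);
CREATE TABLE Notes (
    ID          INTEGER PRIMARY KEY,
    Body        TEXT,                    -- hypothetical stand-in for Attr1, ...
    Location_fk INTEGER NOT NULL REFERENCES Location(ID)
);
CREATE TABLE Multimedia (
    ID       INTEGER PRIMARY KEY,
    Url      TEXT,                       -- hypothetical stand-in for Attr1, ...
    Notes_fk INTEGER NOT NULL REFERENCES Notes(ID)
);
""")


def add_location_type_a(conn, attr1, attr2):
    """Insert the shared Location row first, then the type-specific row."""
    loc_id = conn.execute("INSERT INTO Location DEFAULT VALUES").lastrowid
    conn.execute(
        "INSERT INTO LocationTypeA (Attr1, Attr2, Location_fk) VALUES (?, ?, ?)",
        (attr1, attr2, loc_id),
    )
    return loc_id


def add_note(conn, location_id, body):
    """Notes only know about Location, not about the concrete location type."""
    cur = conn.execute(
        "INSERT INTO Notes (Body, Location_fk) VALUES (?, ?)", (body, location_id)
    )
    return cur.lastrowid


# One location of each type, each with a note stored in the same Notes table.
loc_a = add_location_type_a(conn, "a1", "a2")
loc_b = conn.execute("INSERT INTO Location DEFAULT VALUES").lastrowid
conn.execute("INSERT INTO LocationTypeB (AttrX, Location_fk) VALUES (?, ?)", ("b1", loc_b))
note_id = add_note(conn, loc_a, "note on a type A location")
add_note(conn, loc_b, "note on a type B location")
conn.execute("INSERT INTO Multimedia (Url, Notes_fk) VALUES (?, ?)", ("photo.jpg", note_id))
```

The IDs-only table is what lets Notes carry one ordinary foreign key instead of one notes table per location type; the cost is an extra insert whenever a location is created.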

So, to the question: which would be the better solution in terms of performance and scalability? Which would you go with, or which alternatives would you propose? I don't have a problem implementing any of these. Options 2 and 3 would be interesting to develop, since I've never done anything like that, but I don't want to go with an option that will collapse on itself when the content grows a bit. You're probably thinking "why not just use Drupal if you know it works like you expect it to?", and I'm thinking "you obviously don't know how difficult it is to use Drupal; either that or you're an expert, which I'm most definitely not".

Also, now that I've written all of this: do you think option 2 is a good idea overall? Do you know of a better way to group entities or simulate inheritance? (Please don't say "just use inheritance!"; I'm restricted to using MySQL.)

Thanks for your feedback, and I'm sorry if I wrote too much and said too little.

1 answer

ORM systems usually use the following approaches, which are mostly the same as the solutions you listed:

One table per hierarchy

Pros:

- Simple approach.
- Easy to add new classes: you just need to add new columns for the additional data.
- Supports polymorphism by simply changing the type of the row.
- Data access is fast because the data is in one table.
- Ad-hoc reporting is very easy because all of the data is found in one table.

Cons:

- Coupling within the class hierarchy is increased because all classes are directly coupled to the same table.
- A change in one class can affect the table, which can then affect the other classes in the hierarchy.
- Space is potentially wasted in the database.
- Indicating the type becomes complex when significant overlap between types exists.
- The table can grow quickly for large hierarchies.

When to use:

This is a good strategy for simple and/or shallow class hierarchies where there is little or no overlap between the types within the hierarchy.
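
A minimal sketch of one table per hierarchy applied to the locations from the question, again using Python's sqlite3 as a stand-in for MySQL. The 'park' and 'shop' types, the AreaM2 and OpensAt columns, and the load function are hypothetical and only illustrate the discriminator column and the unused (NULL) columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)

# One table for every location type; Type is the discriminator column.
conn.execute("""
CREATE TABLE Location (
    ID      INTEGER PRIMARY KEY,
    Type    TEXT NOT NULL CHECK (Type IN ('park', 'shop')),
    Name    TEXT NOT NULL,
    AreaM2  REAL,   -- only meaningful for parks, NULL otherwise
    OpensAt TEXT    -- only meaningful for shops, NULL otherwise
)
""")
conn.executemany(
    "INSERT INTO Location (Type, Name, AreaM2, OpensAt) VALUES (?, ?, ?, ?)",
    [
        ("park", "North Park", 1200.0, None),
        ("shop", "Corner Bakery", None, "07:00"),
    ],
)


def load(conn, location_id):
    """Read one row and keep only the columns that apply to its type."""
    type_, name, area, opens_at = conn.execute(
        "SELECT Type, Name, AreaM2, OpensAt FROM Location WHERE ID = ?",
        (location_id,),
    ).fetchone()
    if type_ == "park":
        return {"type": type_, "name": name, "area_m2": area}
    return {"type": type_, "name": name, "opens_at": opens_at}
```

Every read is a single-row lookup with no joins, which is where the speed comes from; each new type adds more columns that stay NULL for all other types.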

One table per concrete class

Pros:

- Easy to do ad-hoc reporting, as all the data you need about a single class is stored in only one table.
- Good performance when accessing a single object's data.

Cons:

- When you modify a class, you need to modify its table and the table of any of its subclasses. For example, if you were to add height and weight to the Person class, you would need to add columns to the Customer, Employee, and Executive tables.
- Whenever an object changes its role (perhaps you hire one of your customers), you need to copy the data into the appropriate table and assign it a new POID (persistent object ID) value, or perhaps reuse the existing POID value.
- It is difficult to support multiple roles and still maintain data integrity. For example, where would you store the name of someone who is both a customer and an employee?

When to use:

When changing types and/or overlap between types is rare.
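
The two main drawbacks above can be sketched as follows, using the Person, Customer, Employee, and Executive example from the text and Python's sqlite3 as a stand-in for MySQL. The Preferences, Salary, and Bonus columns and the hire_customer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)

# Each concrete class gets a full table, including the inherited Person columns.
conn.executescript("""
CREATE TABLE Customer  (POID INTEGER PRIMARY KEY, Name TEXT NOT NULL, Preferences TEXT);
CREATE TABLE Employee  (POID INTEGER PRIMARY KEY, Name TEXT NOT NULL, Salary REAL);
CREATE TABLE Executive (POID INTEGER PRIMARY KEY, Name TEXT NOT NULL, Salary REAL, Bonus REAL);
""")

# Adding Height and Weight to Person means altering every concrete table.
for table in ("Customer", "Employee", "Executive"):
    conn.execute(f"ALTER TABLE {table} ADD COLUMN Height REAL")
    conn.execute(f"ALTER TABLE {table} ADD COLUMN Weight REAL")


def hire_customer(conn, poid, salary):
    """Role change: copy the shared columns to Employee, then drop the Customer row.

    This sketch reuses the existing POID, one of the two choices the text mentions.
    """
    name, height, weight = conn.execute(
        "SELECT Name, Height, Weight FROM Customer WHERE POID = ?", (poid,)
    ).fetchone()
    conn.execute(
        "INSERT INTO Employee (POID, Name, Salary, Height, Weight) VALUES (?, ?, ?, ?, ?)",
        (poid, name, salary, height, weight),
    )
    conn.execute("DELETE FROM Customer WHERE POID = ?", (poid,))


conn.execute("INSERT INTO Customer (POID, Name, Height, Weight) VALUES (1, 'Ann', 1.70, 60.0)")
hire_customer(conn, 1, 50000.0)
```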

One table per class

Pros:

- Easy to understand because of the one-to-one mapping.
- Supports polymorphism very well, as you merely have records in the appropriate tables for each type.
- Very easy to modify superclasses and add new subclasses, as you merely need to modify/add one table.
- Data size grows in direct proportion to growth in the number of objects.

Cons:

- There are many tables in the database, one for every class (plus tables to maintain relationships).
- Potentially takes longer to read and write data using this technique because you need to access multiple tables. This problem can be alleviated if you organize your database intelligently by putting each table within a class hierarchy on different physical disk-drive platters (this assumes that the disk-drive heads all operate independently).
- Ad-hoc reporting on your database is difficult, unless you add views to simulate the desired tables.

When to use:

When there is significant overlap between types or when changing types is common.
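
A minimal sketch of one table per class, including the views mentioned above for ad-hoc reporting. It uses Python's sqlite3 as a stand-in for MySQL; the column names and the CustomerView and EmployeeView names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Superclass table holds the shared columns exactly once.
CREATE TABLE Person (
    POID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL
);
-- Subclass tables share the superclass key and add only their own columns.
CREATE TABLE Customer (
    POID        INTEGER PRIMARY KEY REFERENCES Person(POID),
    Preferences TEXT
);
CREATE TABLE Employee (
    POID   INTEGER PRIMARY KEY REFERENCES Person(POID),
    Salary REAL
);
-- Views simulate the "flat" tables that reporting queries expect.
CREATE VIEW CustomerView AS
    SELECT p.POID, p.Name, c.Preferences
    FROM Person p JOIN Customer c ON c.POID = p.POID;
CREATE VIEW EmployeeView AS
    SELECT p.POID, p.Name, e.Salary
    FROM Person p JOIN Employee e ON e.POID = p.POID;
""")

# Someone who is both a customer and an employee: one Person row, two role rows.
conn.execute("INSERT INTO Person (POID, Name) VALUES (1, 'Ann')")
conn.execute("INSERT INTO Customer (POID, Preferences) VALUES (1, 'email')")
conn.execute("INSERT INTO Employee (POID, Salary) VALUES (1, 50000.0)")

# The name lives in one place, so a single update is visible through both views.
conn.execute("UPDATE Person SET Name = 'Ann Smith' WHERE POID = 1")
```

This is essentially option 2 from the question with attributes on the shared table, and it answers the "both a customer and an employee" problem from the previous approach at the cost of a join per read.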

Generic schema

Pros:

- Works very well when database access is encapsulated by a robust persistence framework.
- It can be extended to provide metadata to support a wide range of mappings, including relationship mappings. In short, it is the start of a mapping metadata engine.
- It is incredibly flexible, enabling you to quickly change the way that you store objects because you merely need to update the metadata stored in the Class, Inheritance, Attribute, and AttributeType tables accordingly.

Cons:

- Very advanced technique that can be difficult to implement at first.
- It only works for small amounts of data because you need to access many database rows to build a single object.
- You will likely want to build a small administration application to maintain the metadata.
- Reporting against this data can be very difficult due to the need to access several rows to obtain the data for a single object.

When to use:

For complex applications that work with small amounts of data, or for applications where data access isn't very common or where you can pre-load data into caches.
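
A heavily simplified sketch of a generic schema, to show why a single object costs many rows. It keeps only Class and Attribute from the tables named above and adds hypothetical Object and AttributeValue tables to hold the data; Inheritance and AttributeType are left out, and all values are stored as text. Python's sqlite3 stands in for MySQL, and save_object and load_object are hypothetical helpers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Class (
    ID   INTEGER PRIMARY KEY,
    Name TEXT NOT NULL UNIQUE
);
CREATE TABLE Attribute (
    ID       INTEGER PRIMARY KEY,
    Class_fk INTEGER NOT NULL REFERENCES Class(ID),
    Name     TEXT NOT NULL,
    UNIQUE (Class_fk, Name)
);
CREATE TABLE Object (
    ID       INTEGER PRIMARY KEY,
    Class_fk INTEGER NOT NULL REFERENCES Class(ID)
);
CREATE TABLE AttributeValue (
    Object_fk    INTEGER NOT NULL REFERENCES Object(ID),
    Attribute_fk INTEGER NOT NULL REFERENCES Attribute(ID),
    Value        TEXT,
    PRIMARY KEY (Object_fk, Attribute_fk)
);
""")

# Metadata: describing a class is data, not DDL.
class_id = conn.execute("INSERT INTO Class (Name) VALUES ('LocationTypeA')").lastrowid
for attr_name in ("Attr1", "Attr2"):
    conn.execute("INSERT INTO Attribute (Class_fk, Name) VALUES (?, ?)", (class_id, attr_name))


def save_object(conn, class_name, values):
    """Store one object as one Object row plus one AttributeValue row per attribute."""
    cls = conn.execute("SELECT ID FROM Class WHERE Name = ?", (class_name,)).fetchone()[0]
    object_id = conn.execute("INSERT INTO Object (Class_fk) VALUES (?)", (cls,)).lastrowid
    for name, value in values.items():
        attr_id = conn.execute(
            "SELECT ID FROM Attribute WHERE Class_fk = ? AND Name = ?", (cls, name)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO AttributeValue (Object_fk, Attribute_fk, Value) VALUES (?, ?, ?)",
            (object_id, attr_id, str(value)),
        )
    return object_id


def load_object(conn, object_id):
    """Rebuild one object by collecting all of its attribute rows."""
    rows = conn.execute(
        """
        SELECT a.Name, v.Value
        FROM AttributeValue v
        JOIN Attribute a ON a.ID = v.Attribute_fk
        WHERE v.Object_fk = ?
        """,
        (object_id,),
    ).fetchall()
    return dict(rows)


obj_id = save_object(conn, "LocationTypeA", {"Attr1": "x", "Attr2": "y"})
```

Saving this two-attribute object takes three inserts plus lookups, and the row count grows with every attribute, which matches the concern raised about option 3 in the question.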