Database design: objects with different attributes

I'm designing a product database where products can have very different attributes depending on their type, but the attributes are fixed for each type and the types are not manageable at all. E.g.:

magazine: title, issue_number, pages, copies, close_date, release_date
web_site: name, bandwidth, hits, date_from, date_to

I want to use InnoDB and enforce database integrity as much as the engine allows. What's the recommended way to handle this?

I hate those designs where tables have 100 columns and most of the values are NULL, so I thought about something like this:

product_type
============

product_type_id INT
product_type_name VARCHAR

product
=======

product_id INT
product_name VARCHAR
product_type_id INT -> Foreign key to product_type.product_type_id
valid_since DATETIME
valid_to DATETIME

magazine
========

magazine_id INT
title VARCHAR
product_id INT -> Foreign key to product.product_id
issue_number INT
pages INT
copies INT
close_date DATETIME
release_date DATETIME

web_site
========

web_site_id INT
name VARCHAR
product_id INT -> Foreign key to product.product_id
bandwidth INT
hits INT
date_from DATETIME
date_to DATETIME

This can handle cascaded product deletion but... Well, I'm not fully convinced...

This is the classic impedance mismatch between an OO design and relational tables. The table design you've described is known as 'table per subclass'. The three most common designs are all compromises compared to what your objects actually look like in your app:

  1. Table per concrete class
  2. Table per hierarchy
  3. Table per subclass

The design you don't like, "where tables have 100 columns and most of the values are NULL", is option 2: one table that stores the whole specialization hierarchy. It is the least flexible for all kinds of reasons, including that if your app requires a new sub-class, you need to add columns. The design you describe accommodates change much better, because you can extend it by adding a new sub-class table described by a value in product_type.
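For contrast, here is a minimal sketch of the single-table (table per hierarchy) layout, using Python's built-in sqlite3 module as a stand-in for InnoDB. The product_type discriminator column and the sample rows are assumptions made for illustration, and only a few of the attributes from the question are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    id           INTEGER PRIMARY KEY,
    product_type VARCHAR(20)  NOT NULL,   -- assumed discriminator column
    product_name VARCHAR(100) NOT NULL,
    issue_number INTEGER,                 -- magazine only
    pages        INTEGER,                 -- magazine only
    bandwidth    INTEGER,                 -- web_site only
    hits         INTEGER                  -- web_site only
);
INSERT INTO product (id, product_type, product_name, issue_number, pages)
    VALUES (1, 'magazine', 'Monthly Review', 42, 96);
INSERT INTO product (id, product_type, product_name, bandwidth, hits)
    VALUES (2, 'web_site', 'example.org', 500, 12000);
""")

# The web_site row carries NULLs in every magazine-specific column.
row = conn.execute(
    "SELECT issue_number, pages, bandwidth, hits FROM product WHERE id = 2"
).fetchone()
print(row)
```

Every new product type adds more nullable columns to this one table, and the database cannot easily require that, say, pages is filled in for magazines only.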

The remaining option, 1 (table per concrete class), is usually undesirable because every common field has to be duplicated in each specialization table. Its advantages are that you won't need to perform any joins, and in a very large system the sub-class tables can even live on different db instances.

The design you described is perfectly viable. The variation below is how it might look if you were using an ORM tool to do your CRUD operations. Notice how the ID in each sub-class table IS the FK value to the parent table in the hierarchy. A good ORM will automatically manage the correct sub-class table CRUD based on the discriminator values in product.id and product.product_type_id alone. Whether or not you plan to use an ORM, look at Hibernate's joined-subclass documentation, if only to see the design decisions they made.

product
=======

id INT
product_name VARCHAR
product_type_id INT -> Foreign key to product_type.product_type_id
valid_since DATETIME
valid_to DATETIME

magazine
========

id INT -> Foreign key to product.id
title VARCHAR
..

web_site
========

id INT -> Foreign key to product.id
name VARCHAR
..
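The shared-key layout above could be declared roughly as follows. This is a sketch under stated assumptions, not a definitive schema: it runs against SQLite through Python's sqlite3 module so that it is self-contained, whereas on MySQL the tables would be created with ENGINE=InnoDB. The column sizes, NOT NULL choices and sample rows are illustrative. The foreign keys are written as table-level FOREIGN KEY clauses because MySQL has historically parsed but ignored inline REFERENCES in a column definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific; InnoDB enforces FKs by default
conn.executescript("""
CREATE TABLE product_type (
    product_type_id   INTEGER PRIMARY KEY,
    product_type_name VARCHAR(50) NOT NULL UNIQUE
);
CREATE TABLE product (
    id              INTEGER PRIMARY KEY,
    product_name    VARCHAR(100) NOT NULL,
    product_type_id INTEGER NOT NULL,
    valid_since     DATETIME,
    valid_to        DATETIME,
    FOREIGN KEY (product_type_id) REFERENCES product_type (product_type_id)
);
CREATE TABLE magazine (
    id           INTEGER PRIMARY KEY,      -- same value as product.id
    title        VARCHAR(100) NOT NULL,
    issue_number INTEGER,
    pages        INTEGER,
    FOREIGN KEY (id) REFERENCES product (id) ON DELETE CASCADE
);
""")

conn.execute("INSERT INTO product_type VALUES (1, 'magazine')")
conn.execute(
    "INSERT INTO product (id, product_name, product_type_id) VALUES (10, 'Monthly Review', 1)"
)
conn.execute(
    "INSERT INTO magazine (id, title, issue_number, pages) VALUES (10, 'Monthly Review', 42, 96)"
)

# A sub-class row without a parent product row is rejected by the foreign key.
try:
    conn.execute("INSERT INTO magazine (id, title) VALUES (99, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the product removes its magazine row through ON DELETE CASCADE.
conn.execute("DELETE FROM product WHERE id = 10")
remaining = conn.execute("SELECT COUNT(*) FROM magazine").fetchone()[0]
print(orphan_rejected, remaining)
```

Note that nothing here stops a product whose type is 'web_site' from owning a row in magazine; if that matters, it needs an additional constraint or application-level checks.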

You seem to be roughly on the right track, except that you may need to consider the difference between "a product" and what's often called "a stock-keeping unit" (SKU). Is a 25-unit box of paper clips (of a certain specific kind) the same "product" as a 50-unit box thereof? In terms of a store, or any kind of inventory system, the distinction matters; in some cases, indeed, a simple difference in packaging of what's otherwise the same amount of the same underlying "product" may give you distinct SKUs to keep track of.

If this matters to your application, you need to decide where you want to keep track of it. For example, it may be OK to lay out the products as you do and deal with packaging for SKU purposes in other tables, even though for some apps that might be a slight overhead.

This is actually a standard way to "enforce" a sort of OO design in a classical RDBMS.

All the "common" attributes go on the master table (e.g. Price, if it is maintained at the product level, could easily be part of the main table), while the specifics go on a subtable.

In theory, if you have sub-sub-types (e.g. magazines could be subtyped into daily newspapers and 4-colour periodicals, maybe, with periodicals having a date interval for shelf-life), you could add one or more sublevels too...

This is a pretty common (and proven) design. The only concern is that, for most operations, the master table will always be joined with at least one subtable. If you have zillions of items this could have performance implications.
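The join in question can be sketched as below, again with Python's sqlite3 module as a stand-in and with made-up sample data. Because the subtable's primary key is the same value as the master table's primary key, the join is a key-to-key lookup, which is usually cheap; whether it becomes a bottleneck at very large scale is something to measure rather than assume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    id              INTEGER PRIMARY KEY,
    product_name    VARCHAR(100) NOT NULL,
    product_type_id INTEGER NOT NULL
);
CREATE TABLE web_site (
    id        INTEGER PRIMARY KEY,         -- same value as product.id
    name      VARCHAR(100) NOT NULL,
    bandwidth INTEGER,
    hits      INTEGER,
    FOREIGN KEY (id) REFERENCES product (id)
);
INSERT INTO product VALUES (1, 'Banner slot', 2);
INSERT INTO web_site VALUES (1, 'example.org', 500, 12000);
""")

# Common fields come from the master table, specifics from the subtable.
rows = conn.execute("""
    SELECT p.id, p.product_name, w.name, w.hits
    FROM product AS p
    JOIN web_site AS w ON w.id = p.id
""").fetchall()
print(rows)
```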

On the other hand, a common operation like deleting an item (I'd suggest a logical deletion, setting a flag to "true" on the master table) would be implemented once, on the master table, for every kind of subtype.
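A minimal sketch of that logical deletion follows, with Python's sqlite3 module as a stand-in. The flag column name is_deleted is an assumption (the text only says "a flag"), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    id           INTEGER PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    is_deleted   INTEGER NOT NULL DEFAULT 0   -- hypothetical flag column
);
CREATE TABLE magazine (
    id    INTEGER PRIMARY KEY,                -- same value as product.id
    title VARCHAR(100) NOT NULL,
    FOREIGN KEY (id) REFERENCES product (id)
);
INSERT INTO product (id, product_name) VALUES (1, 'Monthly Review');
INSERT INTO product (id, product_name) VALUES (2, 'Weekly Digest');
INSERT INTO magazine VALUES (1, 'Monthly Review');
INSERT INTO magazine VALUES (2, 'Weekly Digest');
""")

# One UPDATE on the master table "deletes" the item, whatever its subtype.
conn.execute("UPDATE product SET is_deleted = 1 WHERE id = ?", (1,))

# Queries then filter on the flag; the subtable rows stay in place.
live = conn.execute("""
    SELECT m.title
    FROM magazine AS m
    JOIN product AS p ON p.id = m.id
    WHERE p.is_deleted = 0
""").fetchall()
print(live)
```

The trade-off is that every query has to remember the is_deleted filter; a view over the live rows is one common way to centralize it.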

Anyway, go for it. And maybe google around for "Object oriented to RDBMS mappings" or somesuch for a complete discussion.
