
Relational Database and Normalization for Relational Tables

I'm trying to figure out the appropriate way to set up a database given this scenario:

I am creating a Movie / TV database. A movie may have multiple genres, and a TV show may have multiple genres.

Essentially, what I am wondering is: if you have a Movie table, a TV table... should you:

  1. have a MovieHasGenre table consisting of a foreign key to the Movie table and a regular field for the genre value

    or

  2. have a MovieHasGenre table AND a Genre table, where MovieHasGenre has two foreign keys, one pointing to the Movie in the Movie table and the other pointing to the Genre in the Genre table

I'm really not sure whether this is something standardized or just a matter of preference. Is speed a concern, given that removing the Genre table means one less join?

Go with option 2.

It's useful to store each Genre once and refer to it via the MovieHasGenre table. That way, if you have other attribute columns for a genre, you don't have to store those attributes redundantly on each row where a given genre is mentioned.
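As a rough illustration of option 2, here is a minimal sketch using Python's built-in sqlite3 module. The column names (movie_id, genre_id, title, name) and the sample rows are assumptions made up for the example, not something from the question; a TV table would presumably get its own link table in the same way.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Hypothetical schema for option 2; column names are illustrative.
conn.executescript("""
    CREATE TABLE Movie (
        movie_id INTEGER PRIMARY KEY,
        title    TEXT NOT NULL
    );
    CREATE TABLE Genre (
        genre_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL UNIQUE   -- each genre is stored exactly once
    );
    CREATE TABLE MovieHasGenre (
        movie_id INTEGER NOT NULL REFERENCES Movie(movie_id),
        genre_id INTEGER NOT NULL REFERENCES Genre(genre_id),
        PRIMARY KEY (movie_id, genre_id)  -- one row per (movie, genre) pair
    );
""")

# Made-up sample data.
conn.executemany("INSERT INTO Movie VALUES (?, ?)", [(1, "Alien"), (2, "Shrek")])
conn.executemany("INSERT INTO Genre VALUES (?, ?)",
                 [(1, "Horror"), (2, "Sci-Fi"), (3, "Comedy")])
conn.executemany("INSERT INTO MovieHasGenre VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 3)])

# Listing genres per movie takes two joins: through the link table to Genre.
rows = conn.execute("""
    SELECT m.title, g.name
    FROM Movie m
    JOIN MovieHasGenre mg ON mg.movie_id = m.movie_id
    JOIN Genre g          ON g.genre_id  = mg.genre_id
    ORDER BY m.title, g.name
""").fetchall()
print(rows)
```

Any extra genre attributes (a description, say) would go in Genre as additional columns, so they are stored once per genre rather than once per movie.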

Re your comment:

Another case is if you want to change the spelling of a genre and have the change apply to all rows that reference it, with no chance of forgetting some.
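To make the renaming case concrete, here is a small sketch, again with sqlite3 and made-up names and data. The Movie table is left out to keep it short, so movie_id is just a plain integer here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Genre (
        genre_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL UNIQUE
    );
    CREATE TABLE MovieHasGenre (
        movie_id INTEGER NOT NULL,  -- would reference Movie in a full schema
        genre_id INTEGER NOT NULL REFERENCES Genre(genre_id),
        PRIMARY KEY (movie_id, genre_id)
    );
""")
conn.execute("INSERT INTO Genre VALUES (1, 'Sci-Fi')")
conn.executemany("INSERT INTO MovieHasGenre VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 1)])

# Changing the spelling touches exactly one row in Genre...
cur = conn.execute(
    "UPDATE Genre SET name = 'Science Fiction' WHERE name = 'Sci-Fi'")
updated = cur.rowcount

# ...and every movie that references the genre sees the new name.
names = [row[0] for row in conn.execute("""
    SELECT g.name
    FROM MovieHasGenre mg
    JOIN Genre g ON g.genre_id = mg.genre_id
""")]
print(updated, names)
```

The link rows only hold the genre's id, so none of them need to change.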

Option 2 is how you would normalize your data.

The problem with option 1 is data redundancy. Instead of using a few bytes to store an INT, you are using a potentially large value to store the name of the genre on every row. The other problem, as Bill said, is the potential for data inconsistency, since you will have to update multiple rows if a genre changes instead of just one.
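For comparison, a sketch of option 1 under the same assumptions (sqlite3, illustrative column names, made-up rows). One of the sample rows deliberately spells the genre differently to show how the copies can drift apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical option 1 layout: the genre name is repeated on every row.
conn.execute("""
    CREATE TABLE MovieHasGenre (
        movie_id INTEGER NOT NULL,  -- would reference Movie in a full schema
        genre    TEXT    NOT NULL,
        PRIMARY KEY (movie_id, genre)
    )
""")
conn.executemany("INSERT INTO MovieHasGenre VALUES (?, ?)",
                 [(1, "Sci-Fi"), (2, "Sci-Fi"), (3, "sci-fi")])  # note the variant

# Renaming now has to touch every row that carries the value...
cur = conn.execute(
    "UPDATE MovieHasGenre SET genre = 'Science Fiction' WHERE genre = 'Sci-Fi'")
updated = cur.rowcount

# ...and the differently spelled row is missed (the comparison is case-sensitive).
genres = sorted(row[0] for row in
                conn.execute("SELECT DISTINCT genre FROM MovieHasGenre"))
print(updated, genres)
```

Nothing in this layout prevents the two spellings from coexisting, which is the inconsistency risk described above.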

However, what you have in option 1 is a denormalized version of option 2, which could perform better than option 2 because it saves a join. Given the seemingly small size of this database, though, I would not expect a significant performance difference.
