How to correctly structure schema

I'm enrolled in a DBM/BI certificate program (more like a crash course), and I decided to take on an independent project to put everything I'm learning into practice as I go. Long story short, I'll be using MySQL Server/Workbench to analyze data from boxofficemojo.com on the 130 top-grossing movies of the last 13 years. First I'd like to map out a schema, and then do some data mining and visualization. Here's how I've split it up so far:

"Movies"
 Movie_ID (Primary )
 Dom_Revenue
 Int_Revenue
 OpWe_Revenue
 Budget


"Rating"
Rating_ID (P)
Rating

"Release"
Release_ID (P)
Year
Month
Day
Movie_ID (F)

"Cast"
Director_Gender (P)
Lead_Gender (P)
Director_Name
Director_Name
Movie_ID (F)

"Studio"
Studio_ID (P)
Studio_Name

And these are my relationships so far:

rating to movies - one to many ( many movies can be rated R , a movie can only have 1 rating )
release to movies - one to many ( many movies can be released on the same weekend, a movie can only be released once)
cast to movies - one to many (directors/actors can make many movies, a movie can only have one cast)
studio to movies - many to many (movies can be attached to more than one studio, a studio can make more than one movie)

I know the schema is most likely not 100% correct, so should I include the primary keys from all the other tables as foreign keys in the "Movies" table? And how are my relationships?

Thanks in advance.

It looks OK to me.

I just think the "Release" entity may be a bit overkill (what's the use of knowing which movies were released at the same time?), so I think it could just be a set of movie attributes.

Also, your "Cast" entity has two directors. Maybe you could normalize that and keep only one director (since movie 1<-->N director, it's just a matter of adding relationships).

About FKs: yes, you should add them. Your relationships look fine.
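
As a rough sketch of what adding one of those foreign keys could look like in MySQL, the statement below puts Rating_ID on Movies, since rating-to-movies is one-to-many. It assumes the Movies and Rating tables already exist with the primary keys from the question and that Rating.Rating_ID is an INT; the constraint name fk_movies_rating is made up for illustration.

```sql
-- Assumption: Movies and Rating already exist, and Rating.Rating_ID is INT.
-- The FK lives on the "many" side (Movies), pointing at the "one" side (Rating).
ALTER TABLE Movies
  ADD COLUMN Rating_ID INT NULL,
  ADD CONSTRAINT fk_movies_rating
    FOREIGN KEY (Rating_ID) REFERENCES Rating (Rating_ID);
```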

Good luck.

This is related to the first answer by Leo, but I'll be more specific and add a few more observations.

First, the Release attributes are functionally dependent on Movie_ID (or on Movies in general), so Release should not be a separate entity.

Second, and related to the first point: you have Year, Month and Day in your Release entity. Why not make it a single Release_Date, which contains the year, month and day anyway? Then you could make the Release attributes part of Movies.

Third, and also related to the first point: why not add a Movie_Title field?

So, all in all, you could have the following schema:

"Movies"
Movie_ID (Primary )
Movie_Title
Dom_Revenue
Int_Revenue
OpWe_Revenue
Budget
Release_Date
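
For illustration only, the listing above might translate into MySQL DDL along these lines. The column types are assumptions (the original post does not give any), and the comments on the revenue columns reflect what the abbreviations presumably mean.

```sql
-- Column types are assumptions; adjust them to the actual data.
CREATE TABLE Movies (
  Movie_ID     INT AUTO_INCREMENT PRIMARY KEY,
  Movie_Title  VARCHAR(255) NOT NULL,
  Dom_Revenue  DECIMAL(15,2),   -- presumably domestic gross
  Int_Revenue  DECIMAL(15,2),   -- presumably international gross
  OpWe_Revenue DECIMAL(15,2),   -- presumably opening-weekend gross
  Budget       DECIMAL(15,2),
  Release_Date DATE             -- replaces the separate Year/Month/Day columns
);
```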

You could easily query movies that were released in a certain year, like this:

-- Movies released in 2011
SELECT Movie_Title, YEAR(Release_Date) AS Release_Year
FROM Movies
WHERE YEAR(Release_Date) = 2011;

Or you could also count them by year (or by month):

-- Number of movies released per year
SELECT YEAR(Release_Date) AS Release_Year, COUNT(*) AS Number_of_Movies_in_a_Year
FROM Movies
GROUP BY YEAR(Release_Date)
ORDER BY YEAR(Release_Date);
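
One possible variation, not from the original answer: the WHERE YEAR(Release_Date) = 2011 filter shown earlier can also be written as a date range. In MySQL, wrapping the column in YEAR() generally keeps an ordinary index on Release_Date from being used, whereas a plain range comparison can use one. The sketch assumes Release_Date is a DATE column.

```sql
-- Movies released in 2011, written as a half-open date range
SELECT Movie_Title, Release_Date
FROM Movies
WHERE Release_Date >= '2011-01-01'
  AND Release_Date <  '2012-01-01';
```

With only 130 rows the difference is unlikely to matter; the range form is mainly a habit worth having for larger tables.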

Fourth, about your Cast entity you said "directors/actors can make many movies, a movie can only have one cast". But your Cast table has a Movie_ID attribute, which is an FK (foreign key) from Movies, and that means a movie could have many Cast rows, because the FK is always on the "many" side. Besides, this entity comes close to violating 4NF (Fourth Normal Form). So probably the best way to do this is to specialize your Cast table and relate it to the Movies table so that the relationship is one-to-many, that is, a cast member or director can have many movies. It would look like this:

 "Cast"
 Cast_ID (PK)
 Cast_Name
 Cast_Gender
 Cast_Type (values here could either be Director or Lead or could be simply letters like D or L)
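
A minimal sketch of that table in MySQL, assuming the column types and the two ENUM values (the answer only says the type could be Director or Lead, or a letter such as D or L):

```sql
-- `Cast` is backtick-quoted because CAST is also a built-in MySQL function name.
-- Column types and the ENUM values are assumptions.
CREATE TABLE `Cast` (
  Cast_ID     INT AUTO_INCREMENT PRIMARY KEY,
  Cast_Name   VARCHAR(255) NOT NULL,
  Cast_Gender CHAR(1),                           -- e.g. 'M' or 'F'
  Cast_Type   ENUM('Director', 'Lead') NOT NULL  -- role of this person
);
```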

And your Movies table could now be changed to look like this:

"Movies"
Movie_ID (Primary )
Movie_Title
Dom_Revenue
Int_Revenue
OpWe_Revenue
Budget
Release_Date
Lead_ID (FK)
Cast_ID (FK)
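
To show how the two foreign keys might be used, here is a sketch of a query that joins Movies to Cast twice. The listing does not say which person Cast_ID holds, so the query assumes Cast_ID points at the director's row and Lead_ID at the lead actor's row; the aliases m, d and l are illustrative.

```sql
-- Assumption: Movies.Cast_ID references the director, Movies.Lead_ID the lead actor.
SELECT m.Movie_Title,
       d.Cast_Name AS Director_Name,
       l.Cast_Name AS Lead_Name
FROM Movies AS m
JOIN `Cast` AS d ON d.Cast_ID = m.Cast_ID   -- director row
JOIN `Cast` AS l ON l.Cast_ID = m.Lead_ID;  -- lead actor row
```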

Lastly, you said "movies can be attached to more than one studio, a studio can make more than one movie". A many-to-many relationship usually needs a bridge table between the two entities. So, let's say you have a Studio_Movie entity/table as your bridge table; then you would have something like this:

"Studio_Movie"
Studio_ID (PK, FK1)
Movie_ID (PK, FK2)
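
As a hedged sketch, the bridge table could be declared as below, with a composite primary key made of the two foreign keys. It assumes the Studio and Movies tables already exist with INT primary keys; the constraint names fk_sm_studio and fk_sm_movie are invented for the example.

```sql
-- Assumption: Studio and Movies already exist with INT primary keys.
CREATE TABLE Studio_Movie (
  Studio_ID INT NOT NULL,
  Movie_ID  INT NOT NULL,
  PRIMARY KEY (Studio_ID, Movie_ID),  -- each studio/movie pair appears once
  CONSTRAINT fk_sm_studio FOREIGN KEY (Studio_ID) REFERENCES Studio (Studio_ID),
  CONSTRAINT fk_sm_movie  FOREIGN KEY (Movie_ID)  REFERENCES Movies (Movie_ID)
);

-- All studios attached to each movie
SELECT m.Movie_Title, s.Studio_Name
FROM Movies AS m
JOIN Studio_Movie AS sm ON sm.Movie_ID = m.Movie_ID
JOIN Studio AS s ON s.Studio_ID = sm.Studio_ID;
```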
