Database Schema - Create one large table or many?

I'm trying to decide on the best way to store entities which have very similar properties. The main difference is that each entity references other entities. I was going to set up the database as:

entity_a (1-1,000 Records) [Data rarely changes]
id|created|updated|entity_b_id|category_id|name|entity_b_id|entity_c_id|entity_d_id

entity_b (10,000-1,000,000 Records) [Data changes constantly]
id|created|updated|entity_b_id|category_id|name|entity_c_id|entity_e_id|entity_f_id

entity_c (10,000-10,000,000 Records) [Data changes constantly]
id|created|updated|entity_b_id|category_id|name|entity_a_id|entity_f_id

entity_d (0-1,000 Records) [Data rarely changes]
id|created|updated|entity_b_id|category_id|name

entity_e (1-100 Records) [Data rarely changes]
id|created|updated|entity_b_id|category_id|name|entity_a_id|entity_b_id

entity_f (0-50,000 Records) [Data frequently changes]
id|created|updated|entity_b_id|category_id|name|entity_c

entity_g (10-100 Records) [Data rarely changes]
id|created|updated|entity_b_id|category_id|name

entity_h (10-1,000 Records) [Data rarely changes]
id|created|updated|entity_b_id|category_id|name|entity_e_id

entity_i (1-10 Records) [Data rarely changes]
id|created|updated|entity_b_id|category_id|name

But it's been suggested that it would be easier to manage one large table as:

ent (20,000-11,000,000 Records)
id|created|updated|ent_id(b)|category_id|name|ent_id(a)|ent_id(b)|ent_id(c)|ent_id(d)|ent_id(e)|ent_id(f)

A concern with this second method is the table size, as the IDs will be int(11) and there will be six columns of these IDs, which will mostly just be set to 0.
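As a rough worked estimate of that concern: MySQL stores an INT in 4 bytes regardless of the display width in int(11), so the six mostly-unused ID columns cost a fixed amount per row. The sketch below takes the upper bound of 11,000,000 records from the question and ignores row and index overhead, so treat the result (roughly 264 MB) as a back-of-the-envelope figure only. The function name is made up for illustration.

```python
# Back-of-the-envelope size of the six mostly-zero ID columns.
INT_BYTES = 4            # MySQL INT is 4 bytes; the (11) is only a display width
ID_COLUMNS = 6           # ent_id(a) .. ent_id(f)
MAX_ROWS = 11_000_000    # upper bound quoted in the question


def id_column_bytes(rows: int, columns: int = ID_COLUMNS, width: int = INT_BYTES) -> int:
    """Raw storage used by the ID columns, ignoring row and index overhead."""
    return rows * columns * width


total = id_column_bytes(MAX_ROWS)
print(f"{total:,} bytes (~{total / 1024**2:.0f} MiB)")
```

Whether that amount matters depends on the server; the point is only that the cost scales linearly with row count, whatever the columns contain.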

But my main concern is speed of access, as the records will be accessed very frequently by many users at once. I'm using CodeIgniter and hope to use its caching abilities to take as much load off the database as possible, but that will be limited as some of the data will change from second to second.

Any help would be most appreciated.

I think it's hard to anticipate the actual performance of one vs. the other, since it depends on so many things.

Several considerations:

How important is the difference between entities? If you find yourself often selecting just one type of entity per query, then the normalized solution is likely faster.

If you have queries that select on things other than the shared columns, like entity_a with entity_c IN (something), you'll want an index on the entity_c column.

entity_c is very large. If it gets updated a lot and queried very rarely, then that's a cause for concern if you're going for the de-normalized version.

If you're doing a lot of JOINs, I'm pretty sure the normalized form is faster.
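To illustrate the indexing consideration above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the query planner of a real MySQL server may behave differently. The table is a trimmed-down entity_a, the column is written entity_c_id as in the question's schema, and the index name and sample data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity_a (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        entity_c_id INTEGER
    );
    -- Hypothetical index name; supports lookups on entity_c_id.
    CREATE INDEX idx_entity_a_entity_c ON entity_a (entity_c_id);
""")

# Made-up sample data: 1000 rows spread over 50 entity_c ids.
conn.executemany(
    "INSERT INTO entity_a (name, entity_c_id) VALUES (?, ?)",
    [(f"a{i}", i % 50) for i in range(1000)],
)

query = "SELECT id, name FROM entity_a WHERE entity_c_id IN (3, 7, 11)"
rows = conn.execute(query).fetchall()

# EXPLAIN QUERY PLAN returns (id, parent, notused, detail) rows;
# the detail text says whether an index is used for the lookup.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
print(len(rows), plan)
```

On MySQL the equivalent check would be to run EXPLAIN on the query and look at which key is chosen.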

My advice would be: use the normalized form. If you're seeing performance issues you could look at this solution.

You could also go for a hybrid solution. Since b and c change often and the others don't, make two tables split along that line: one for the entities that change often and one for those that don't. Or give b and c their own tables but keep the others in one.
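The second hybrid variant (b and c in their own tables, everything else in one) could look roughly like the sketch below. It again uses Python's sqlite3 module in place of MySQL, and it is only one possible reading of the suggestion: the table name other_entity and the kind discriminator column are assumptions, and the per-entity reference columns of the merged entities are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Frequently changing entities keep their own tables.
    CREATE TABLE entity_b (
        id INTEGER PRIMARY KEY, created TEXT, updated TEXT,
        category_id INTEGER, name TEXT,
        entity_c_id INTEGER, entity_e_id INTEGER, entity_f_id INTEGER
    );
    CREATE TABLE entity_c (
        id INTEGER PRIMARY KEY, created TEXT, updated TEXT,
        category_id INTEGER, name TEXT,
        entity_a_id INTEGER, entity_f_id INTEGER
    );
    -- Remaining entities share one table; `kind` (hypothetical column)
    -- records which entity type each row represents.
    CREATE TABLE other_entity (
        id INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('a','d','e','f','g','h','i')),
        created TEXT, updated TEXT, category_id INTEGER, name TEXT
    );
    CREATE INDEX idx_other_entity_kind ON other_entity (kind);
""")

# Made-up rows to show a per-type lookup on the shared table.
conn.executemany(
    "INSERT INTO other_entity (kind, name) VALUES (?, ?)",
    [("a", "first a"), ("d", "first d"), ("a", "second a")],
)
names = [row[0] for row in conn.execute(
    "SELECT name FROM other_entity WHERE kind = ? ORDER BY id", ("a",)
)]
print(names)
```

The design choice here is that the large, write-heavy tables stay narrow and separate, while the small tables trade a discriminator column for fewer tables to manage.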
