One Large Table or Multiple Tables?

I am trying to build a website that works much like Facebook groups: users can join groups and then post within them. However, I am having trouble designing the database schema for groups and posts. This is my schema so far:

Table 1: Users
Table 2: Groups
Table 3: Posts

The Posts table gets a new row every time a user posts within a group. Each row stores the unique ID of the group the post belongs to, as well as the unique ID of the user who created it. My worry is that the Posts table will become massive, especially in comparison with the Groups and Users tables.

Considering that there will be many posts (hundreds to thousands) per group, should I create a new table for every group?

Any input on this matter would be greatly appreciated.

In a word, NO. You should NOT create multiple tables. A single posts table is appropriate; index it properly and it should be fine. Hundreds or thousands of posts are practically nothing to a database, which is designed to manage millions of rows with proper indexing. A column in the table should identify which group each post belongs to, but you should not split the data into separate tables.
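As a rough illustration of the single-table approach, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs anywhere; the table name `posts`, the column names and the index name `idx_posts_group` are assumptions made for the example, not part of the original schema.

```python
import sqlite3

# SQLite stands in for MySQL here; all names below are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id       INTEGER PRIMARY KEY,
        group_id INTEGER NOT NULL,  -- which group the post belongs to
        user_id  INTEGER NOT NULL,  -- who wrote the post
        body     TEXT    NOT NULL
    );
    CREATE INDEX idx_posts_group ON posts (group_id);
""")

# Posts from every group live in the same table.
rows = [(g, u, f"post {n} in group {g}")
        for g in (1, 2, 3) for u in (10, 11) for n in range(500)]
conn.executemany(
    "INSERT INTO posts (group_id, user_id, body) VALUES (?, ?, ?)", rows)


def posts_in_group(group_id):
    """Return all posts for one group; the group_id column does the filtering."""
    cur = conn.execute(
        "SELECT id, user_id, body FROM posts WHERE group_id = ? ORDER BY id",
        (group_id,))
    return cur.fetchall()


group_2_posts = posts_in_group(2)
total_posts = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(len(group_2_posts), total_posts)
```

With one table, adding a new group needs no schema change, and questions that span groups (such as a total post count) stay a single query.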

In the very worst case, you could partition the table if it ever grew too large to fit in your disk space. However, the likelihood of it growing that large is incredibly small.

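Unless you have a huge number of posts, or the posts can be very large, or your hardware is very limited, you will be fine with a single MySQL table as long as you index it by group ID.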

Unless we are talking about millions of rows or more, you will be fine as long as you index the table properly (index by both IDs).
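One possible reading of "index by both IDs" is a single composite index on the group ID and the user ID; two separate single-column indexes would be another. The sketch below assumes the composite form and uses SQLite (via Python's `sqlite3`) instead of MySQL to check that a lookup by both IDs goes through the index. The names `posts`, `idx_posts_group_user` and `query_plan` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id       INTEGER PRIMARY KEY,
        group_id INTEGER NOT NULL,
        user_id  INTEGER NOT NULL,
        body     TEXT    NOT NULL
    );
    -- Assumption: one composite index covering both IDs.
    CREATE INDEX idx_posts_group_user ON posts (group_id, user_id);
""")


def query_plan(sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    cur = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return [row[3] for row in cur.fetchall()]


plan = query_plan(
    "SELECT body FROM posts WHERE group_id = ? AND user_id = ?", (1, 10))
for line in plan:
    print(line)  # the plan should mention idx_posts_group_user
```

MySQL has its own `EXPLAIN` statement for the same kind of check; its output format differs from SQLite's.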

Put simply, if there is a dependency of any kind between data items, then a new table should be created. You can look it up more precisely here: http://en.wiktionary.org/wiki/first_normal_form

It does not state, though, that a new table should be created every time a table becomes too large. That would be a matter for a database administrator. In your example, the most recent posts will be read more often than those written 5 months ago. For the sake of proper indexing and to avoid duplicate data rows, you could use a structure like this:

[Image: diagram of the proposed table structure, not reproduced]

Note: This diagram says that i) one user will post to one or more groups, ii) one group will have one or more users, and iii) one post will be viewed by one or more users in a group. All 3 relations are one-to-many, and the cardinality between users and groups is many-to-many.
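A many-to-many relation between users and groups is commonly modelled with a junction table holding one row per membership. The following is a minimal sketch under that assumption, again using Python's `sqlite3` rather than MySQL. The table names `users`, `site_groups` and `group_members` and the helper functions are hypothetical and not taken from the diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE users       (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE site_groups (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Junction table: one row per (user, group) membership.
    CREATE TABLE group_members (
        user_id  INTEGER NOT NULL REFERENCES users(id),
        group_id INTEGER NOT NULL REFERENCES site_groups(id),
        PRIMARY KEY (user_id, group_id)  -- forbids duplicate memberships
    );
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
conn.executemany("INSERT INTO site_groups VALUES (?, ?)",
                 [(1, "chess"), (2, "hiking")])
conn.executemany("INSERT INTO group_members VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])


def groups_of(user_id):
    """Names of the groups a user belongs to."""
    cur = conn.execute("""
        SELECT g.name FROM site_groups AS g
        JOIN group_members AS m ON m.group_id = g.id
        WHERE m.user_id = ? ORDER BY g.name""", (user_id,))
    return [name for (name,) in cur]


def members_of(group_id):
    """Names of the users in a group."""
    cur = conn.execute("""
        SELECT u.name FROM users AS u
        JOIN group_members AS m ON m.user_id = u.id
        WHERE m.group_id = ? ORDER BY u.name""", (group_id,))
    return [name for (name,) in cur]


# Joining the same group twice violates the composite primary key.
try:
    conn.execute("INSERT INTO group_members VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(groups_of(1), members_of(1), duplicate_rejected)
```

The composite primary key is what avoids duplicate rows here: a user can appear in many groups and a group can have many users, but each pairing is stored once.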

Moreover, you could group your posts (the table that is likely to grow over time) by year, month or even week, so that you can tell which time period each post belongs to. You could also make this time factor a date field in your posts table instead of a separate table.
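The date-field variant might look like the sketch below, which stores a creation date on each post and selects a time period with a range query. It is only an illustration: SQLite via Python's `sqlite3` replaces MySQL, dates are stored as ISO 8601 strings, and the names `created_at`, `idx_posts_group_created` and `posts_between` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id         INTEGER PRIMARY KEY,
        group_id   INTEGER NOT NULL,
        user_id    INTEGER NOT NULL,
        body       TEXT    NOT NULL,
        created_at TEXT    NOT NULL  -- ISO 8601 date, e.g. '2024-03-15'
    );
    -- Index chosen so "posts of a group in a period" can use one index.
    CREATE INDEX idx_posts_group_created ON posts (group_id, created_at);
""")
conn.executemany(
    "INSERT INTO posts (group_id, user_id, body, created_at) "
    "VALUES (?, ?, ?, ?)",
    [(1, 10, "old post", "2023-10-02"),
     (1, 11, "march post", "2024-03-05"),
     (1, 10, "another march post", "2024-03-28"),
     (2, 10, "other group", "2024-03-10")])


def posts_between(group_id, start, end):
    """Posts in one group with start <= created_at < end (ISO date strings)."""
    cur = conn.execute(
        "SELECT body FROM posts "
        "WHERE group_id = ? AND created_at >= ? AND created_at < ? "
        "ORDER BY created_at",
        (group_id, start, end))
    return [body for (body,) in cur]


march_posts = posts_between(1, "2024-03-01", "2024-04-01")
print(march_posts)
```

Because ISO 8601 date strings sort in chronological order, a plain range comparison picks out a year, month or week without any extra period table.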

