MySQL - store typically/mostly NULL columns in one table or create 1:1 relationship?

Running MySQL Server version: 5.7.27-0ubuntu0.18.04.1

I'm creating a site/app where a user "submission" can be one of:

  1. Text comments
  2. Picture/file upload
  3. Video/file upload (technically more or less the same as #2, just with a different MIME type)

I'm having trouble deciding between the two designs (shortened for brevity)...

CREATE TABLE submissions
(
    submissionID            INT,
    userID                  INT,
    submissionComments      TEXT,
    fileDirectory           VARCHAR(32), -- starting here these are only used 20% of time
    fileName                VARCHAR(128),
    fileMimeType            VARCHAR(128),
    fileSize                INT,
    originalFileName        VARCHAR(64)
);

-OR-

CREATE TABLE submissions
(
    submissionID            INT,
    userID                  INT,
    submissionComments      TEXT
);

CREATE TABLE submissionFiles
(
    submissionFileID        INT,
    submissionID            INT, -- FK to submissions table
    fileDirectory           VARCHAR(32),
    fileName                VARCHAR(128),
    fileMimeType            VARCHAR(128),
    fileSize                INT,
    originalFileName        VARCHAR(64)
);

I'm assuming text comments will probably be 70-80% of submissions.

So, the question becomes: is it better to use a single table and have a bunch of NULL values in fileDirectory/fileName/fileMimeType/fileSize/originalFileName? Or is it better to have a 1:1 relationship to support file uploads? In that case, I'd be creating both a submissions and a submissionFiles record. Obviously most queries would then require joining the two tables.

This essentially comes down to my not having a good understanding of the impact of VARCHAR (and 1 INT) columns in tables where they are mostly NULL. I'm probably pre-optimizing a bit here considering this is a brand new site/app, but I'm trying to plan ahead.

Late addition, 2nd question (as I type this out): I see that TEXT is capable of handling 65,535 characters or 64 KB. That seems like a lot for what a typical user would be submitting (probably less than 500 characters). It would eat up storage pretty quickly. What would be the impact of making submissionComments a VARCHAR(500) instead of TEXT? I'm assuming that, if anything, there are no negative trade-offs besides being able to store "less".

Thanks!

Edit: as madhur pointed out, there are similar questions/good answers about "design patterns". I'm more concerned about performance. Does the presence of a large number of VARCHARs negatively impact data storage/retrieval (by messing up the way MySQL implements pages/extents/etc.)?

I have built schemas either way. At some level, it does not matter. But you may find that certain queries are faster one way (or the other way). The disk usage is about the same.
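If you want to check the disk-usage claim on your own data, one option is to load the same rows into each design and compare the sizes MySQL reports. The query below is only a sketch: it assumes the tables are named submissions and submissionFiles as in the question and live in the currently selected schema, and the InnoDB figures in information_schema are estimates rather than exact byte counts.

```sql
-- Refresh the statistics first so the estimates are reasonably current.
ANALYZE TABLE submissions, submissionFiles;

-- Approximate row counts and on-disk sizes for the tables in the current schema.
SELECT TABLE_NAME,
       TABLE_ROWS,
       DATA_LENGTH,
       INDEX_LENGTH,
       ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS total_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('submissions', 'submissionFiles');
```

For the single-table design, only the submissions row will come back; compare its total against the sum of the two rows from the two-table design.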

Your second option allows for (and hence implies) multiple 'files' per 'submission'. For such a "many:1" relationship, you must use 2 tables.

On the other hand, if there can be only one "file" per "submission", you don't need submissionFileID (which I assume was intended to be the PRIMARY KEY??). Instead, use PRIMARY KEY(submissionID) for that second table.

If you wish to discuss further, please provide the full CREATE TABLE, including NULL or NOT NULL, the PRIMARY KEY of each table, and any secondary indexes.
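A minimal sketch of that 1:1 layout follows. The column names and lengths come from the question; everything else (UNSIGNED, AUTO_INCREMENT, the NOT NULL choices, the foreign key name, ON DELETE CASCADE, ENGINE=InnoDB, and the userID value 42) is an assumption added for illustration, not something the question specifies.

```sql
CREATE TABLE submissions (
    submissionID       INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- assumed surrogate key
    userID             INT UNSIGNED NOT NULL,
    submissionComments TEXT NULL,
    PRIMARY KEY (submissionID)
) ENGINE=InnoDB;

-- At most one file row per submission: the primary key is the parent's key,
-- so no separate submissionFileID is needed.
CREATE TABLE submissionFiles (
    submissionID     INT UNSIGNED NOT NULL,
    fileDirectory    VARCHAR(32)  NOT NULL,
    fileName         VARCHAR(128) NOT NULL,
    fileMimeType     VARCHAR(128) NOT NULL,
    fileSize         INT UNSIGNED NOT NULL,
    originalFileName VARCHAR(64)  NOT NULL,
    PRIMARY KEY (submissionID),
    CONSTRAINT fk_submissionFiles_submission   -- hypothetical constraint name
        FOREIGN KEY (submissionID) REFERENCES submissions (submissionID)
        ON DELETE CASCADE
) ENGINE=InnoDB;

-- Text-only submissions have no matching file row, so the f.* columns come back NULL.
SELECT s.submissionID, s.userID, s.submissionComments,
       f.fileName, f.fileMimeType, f.fileSize
FROM submissions AS s
LEFT JOIN submissionFiles AS f ON f.submissionID = s.submissionID
WHERE s.userID = 42;
```

Because the file columns are either all present or all absent, they can be declared NOT NULL in the second table, and a missing row stands in for "no file".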

submissionComments into VARCHAR(500) instead of TEXT?

  • No storage difference.
  • No speed difference.
  • The former would truncate, giving a warning or error, at 500 characters; the latter would truncate at 65535 bytes. I would simply use TEXT.

Back to the main question. Your example has several columns that are either all NULL or all filled in. Hence, I would lean toward 2 tables.
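Whether an over-long value produces a warning or an error depends on the session's sql_mode. The sketch below uses a hypothetical throwaway table, comment_length_demo, to try both settings; the table and column names are made up for this illustration.

```sql
CREATE TABLE comment_length_demo (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    as_varchar VARCHAR(500),
    as_text    TEXT
) ENGINE=InnoDB;

-- Non-strict mode: over-long values are cut to fit and a warning is recorded.
SET SESSION sql_mode = '';
INSERT INTO comment_length_demo (as_varchar) VALUES (REPEAT('x', 501));
SHOW WARNINGS;
INSERT INTO comment_length_demo (as_text) VALUES (REPEAT('x', 65536));
SHOW WARNINGS;

-- Compare the stored lengths with the 501 characters and 65536 bytes sent.
SELECT id, CHAR_LENGTH(as_varchar) AS varchar_chars, LENGTH(as_text) AS text_bytes
FROM comment_length_demo;

-- Strict mode: the same over-long insert is expected to be rejected with an error
-- instead of being truncated, so run this statement last.
SET SESSION sql_mode = 'STRICT_TRANS_TABLES';
INSERT INTO comment_length_demo (as_varchar) VALUES (REPEAT('x', 501));
```

If the application should enforce a 500-character limit, it may be friendlier to validate the length before the INSERT than to rely on either behaviour.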
