MySQL: store typically/mostly NULL columns in one table or create a 1:1 relationship?

Running MySQL Server version: 5.7.27-0ubuntu0.18.04.1

I'm creating a site/app where a user "submission" can be one of:

  1. Text comments
  2. Picture/file upload
  3. Video/file upload (technically more or less the same as #2, just with a different MIME type)

I'm having trouble deciding between the two designs (shortened for brevity)...

CREATE TABLE submissions
(
    submissionID            INT,
    userID                  INT,
    submissionComments      TEXT,
    fileDirectory           VARCHAR(32), -- from here down, these columns are only used about 20% of the time
    fileName                VARCHAR(128),
    fileMimeType            VARCHAR(128),
    fileSize                INT,
    originalFileName        VARCHAR(64)
);

-OR-

CREATE TABLE submissions
(
    submissionID            INT,
    userID                  INT,
    submissionComments      TEXT
);

CREATE TABLE submissionFiles
(
    submissionFileID        INT,
    submissionID            INT, -- FK to submissions table
    fileDirectory           VARCHAR(32),
    fileName                VARCHAR(128),
    fileMimeType            VARCHAR(128),
    fileSize                INT,
    originalFileName        VARCHAR(64)
);

I'm assuming text comments will probably be 70-80% of submissions.

So the question becomes: is it better to use a single table and have a bunch of NULL values in fileDirectory/fileName/fileMimeType/fileSize/originalFileName? Or is it better to have a 1:1 relationship that is only used when a file is uploaded? In that case, I'd be creating both a submissions and a submissionFiles record. Obviously most queries would then require joining the two tables.

This essentially comes down to my not having a good understanding of the impact of VARCHAR (and one INT) columns in tables where they are mostly NULL. I'm probably pre-optimizing a bit here considering this is a brand new site/app, but I'm trying to plan ahead.

Late addition, second question (as I type this out): I see that TEXT is capable of handling 65,535 characters, or 64 KB. That seems like a lot for what a typical user would be submitting (probably less than 500 characters). It would eat up storage pretty quickly. What would be the impact of making submissionComments a VARCHAR(500) instead of TEXT? I'm assuming that, if anything, there are no negative trade-offs besides being able to store "less".

Thanks!

Edit: as madhur pointed out, there are similar questions with good answers about "design patterns". I'm more concerned about performance. Does the presence of a large number of VARCHAR columns negatively impact data storage/retrieval (by messing up the way MySQL implements pages/extents/etc.)?

I have built schemas both ways. At some level, it does not matter, but you may find that certain queries are faster one way (or the other). The disk usage is about the same.

Your second option allows for (and hence implies) multiple 'files' per 'submission'. For such a "many:1" relationship, you must use 2 tables.

On the other hand, if there can be only one "file" per "submission", you don't need submissionFileID (which I assume was intended to be the PRIMARY KEY?). Instead, use PRIMARY KEY(submissionID) for that second table.
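A minimal sketch of that strict 1:1 layout follows. The column names come from the question; the UNSIGNED, NOT NULL, AUTO_INCREMENT, ENGINE and ON DELETE CASCADE choices, as well as the constraint name, are assumptions added for illustration and were not given in the original post.

```sql
-- Parent table: one row per submission (assumed InnoDB, assumed NOT NULL choices).
CREATE TABLE submissions
(
    submissionID            INT UNSIGNED NOT NULL AUTO_INCREMENT,
    userID                  INT UNSIGNED NOT NULL,
    submissionComments      TEXT NULL,
    PRIMARY KEY (submissionID)
) ENGINE=InnoDB;

-- Child table: at most one row per submission, because submissionID
-- is both the PRIMARY KEY and the foreign key back to submissions.
CREATE TABLE submissionFiles
(
    submissionID            INT UNSIGNED NOT NULL,
    fileDirectory           VARCHAR(32)  NOT NULL,
    fileName                VARCHAR(128) NOT NULL,
    fileMimeType            VARCHAR(128) NOT NULL,
    fileSize                INT UNSIGNED NOT NULL,
    originalFileName        VARCHAR(64)  NOT NULL,
    PRIMARY KEY (submissionID),
    CONSTRAINT fk_submissionFiles_submission   -- hypothetical constraint name
        FOREIGN KEY (submissionID) REFERENCES submissions (submissionID)
        ON DELETE CASCADE                      -- assumption: files go away with the submission
) ENGINE=InnoDB;
```

Because the file columns only exist for rows that actually have a file, they can all be declared NOT NULL here, which is one practical difference from the single-table design.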

If you wish to discuss this further, please provide the full CREATE TABLE, including NULL or NOT NULL, the PRIMARY KEY of each table, and any secondary indexes.

Regarding "submissionComments into VARCHAR(500) instead of TEXT?":

  • No storage difference.
  • No speed difference.
  • The former would truncate at 500 characters, giving a warning or an error; the latter would truncate at 65535 bytes. I would simply use TEXT.
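Whether an over-long value produces a warning or an error depends on the session's SQL mode. The sketch below uses a hypothetical throwaway table, truncation_demo, with a deliberately tiny VARCHAR(5) so the effect is easy to see; it assumes the default InnoDB engine.

```sql
-- Hypothetical demo table, not part of the original schema.
CREATE TABLE truncation_demo (c VARCHAR(5));

-- With strict mode on, the over-long value is rejected with a
-- "Data too long" error and no row is stored.
SET SESSION sql_mode = 'STRICT_TRANS_TABLES';
INSERT INTO truncation_demo VALUES ('abcdefgh');

-- With strict mode off, the row is stored in truncated form
-- and a warning is raised instead.
SET SESSION sql_mode = '';
INSERT INTO truncation_demo VALUES ('abcdefgh');
SHOW WARNINGS;

-- Only the second insert succeeded; the stored value was cut to 5 characters.
SELECT c FROM truncation_demo;
```

The same rule applies to VARCHAR(500) and to TEXT, only with different limits.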

Back to the main question. Your example has several columns that are either all NULL or all filled in. Hence, I would lean toward 2 tables.
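With two tables, a typical read has to bring the optional file columns back in. A minimal sketch, assuming the table and column names from the question and a made-up userID value:

```sql
-- LEFT JOIN keeps text-only submissions; their file columns come back as NULL.
SELECT  s.submissionID,
        s.userID,
        s.submissionComments,
        f.fileName,
        f.fileMimeType,
        f.fileSize
FROM    submissions AS s
LEFT JOIN submissionFiles AS f
        ON f.submissionID = s.submissionID
WHERE   s.userID = 42;   -- 42 is an illustrative value only
```

Queries that only need the comment text can skip the join entirely and read from submissions alone.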

