
MySQL and Normalisation: How to handle lots of optional fields?

I have a profile page with about 20 optional fields on it. To keep it normalised, would I have to create 20 different tables and then do a query with 20 JOINs in it? That seems a bit over the top to me.

Is that the best way to do it?

Do you recommend that I keep it normalised?

A good way to do this (although a bit confusing unless you know what's going on) is to use the same design WordPress uses, known as entity-attribute-value or EAV (thanks to @Matt Fenwick): https://stackoverflow.com/tags/eav/info

The basic idea is that, instead of your 20 INNER JOIN-able tables to store odds and ends, you have two tables. One stores your entity (a post in WordPress's case) and the second stores all your odds and ends, or metadata as WP refers to it. Instead of having a column for each data point, you have a column for the name, one for the value, and one for the ID of the entity that this property applies to.

This way you save yourself a ton of SQL, headaches during extension and time required to build it to begin with. If you ever need to cater for another property, you just bung it in there with the rest - no hacking the schema.

Some more detail on WP's database layout (here I'm thinking mostly of the wp_posts and wp_postmeta tables): http://codex.wordpress.org/Database_Description

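So an example might be (MySQL DDL; the VARCHAR length is only illustrative):

CREATE TABLE yourEntity (
    entityID INT AUTO_INCREMENT PRIMARY KEY,
    title    VARCHAR(255)
);

CREATE TABLE yourEntityMeta (
    entityID INT,
    name     TEXT,
    value    TEXT,
    INDEX (entityID)  -- non-unique key: many rows per entity
);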

This way each entity can have any number of properties, with no restrictions, no performance concerns over unused columns full of NULL values, and no 18 extra tables that need joining.
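To make the read and write path concrete, here is a minimal sketch of the two-table layout in use. It runs against an in-memory SQLite database through Python's built-in sqlite3 module purely so that it is self-contained; SQLite stands in for MySQL, and the helper names add_entity and load_fields, as well as the sample fields, are made up for illustration.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL. Table and column
# names follow the yourEntity / yourEntityMeta example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourEntity (
        entityID INTEGER PRIMARY KEY AUTOINCREMENT,
        title    VARCHAR(255)
    );
    CREATE TABLE yourEntityMeta (
        entityID INTEGER,
        name     TEXT,
        value    TEXT
    );
    CREATE INDEX idx_meta_entity ON yourEntityMeta (entityID);
""")


def add_entity(title, **fields):
    """Insert one entity plus one metadata row per optional field supplied."""
    cur = conn.execute("INSERT INTO yourEntity (title) VALUES (?)", (title,))
    entity_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO yourEntityMeta (entityID, name, value) VALUES (?, ?, ?)",
        [(entity_id, name, str(value)) for name, value in fields.items()],
    )
    return entity_id


def load_fields(entity_id):
    """Return one entity's optional fields as a dict, using a single query."""
    cur = conn.execute(
        "SELECT name, value FROM yourEntityMeta WHERE entityID = ?",
        (entity_id,),
    )
    return dict(cur.fetchall())


# Hypothetical sample data: alice fills in two fields, bob fills in none.
alice = add_entity("alice", city="Leeds", website="https://example.com")
bob = add_entity("bob")  # no optional fields, so no metadata rows at all
```

Adding a 21st optional field later needs no schema change here: it is just another (entityID, name, value) row.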

Hope this helps

Note: one issue with this (pointed out by @ypercube in the comments) is that you can't specify the data type for each attribute, i.e. a date attribute would be stored as text, as would a boolean or an int. You also wouldn't be able to use a foreign key to link to a table of valid values (thanks to @Catcall). You would need to consider this carefully before going down this route.
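The typing problem is easy to see in a small sketch. Again this uses Python's sqlite3 with an in-memory database as a stand-in for MySQL, and the attribute names (birthday, age, newsletter, favourite_number) are invented for the example. Every value goes into the same text column, so the application has to convert and validate on the way out.

```python
import sqlite3
from datetime import date

# Assumption: SQLite stands in for MySQL; attribute names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourEntityMeta (entityID INTEGER, name TEXT, value TEXT)"
)

# A date, an integer and a boolean all end up in the same TEXT column.
conn.executemany(
    "INSERT INTO yourEntityMeta (entityID, name, value) VALUES (?, ?, ?)",
    [
        (1, "birthday", date(1990, 5, 17).isoformat()),
        (1, "age", str(33)),
        (1, "newsletter", "1"),
        (1, "favourite_number", "not a number"),  # the schema cannot reject this
    ],
)

stored = dict(
    conn.execute("SELECT name, value FROM yourEntityMeta WHERE entityID = 1")
)

# Everything comes back as str; conversion is the application's job.
birthday = date.fromisoformat(stored["birthday"])
age = int(stored["age"])
newsletter = stored["newsletter"] == "1"
```

Calling int(stored["favourite_number"]) would raise ValueError at read time, which is exactly the kind of check a typed column would have done at write time.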

I would just use nullable columns for the optional fields. The table will get quite wide, but that many joins will just degrade your performance, and I cannot find a reason why these fields should be normalised if they belong to one object and will be updated together.
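A minimal sketch of the single-table approach, assuming a hypothetical profile table with only three of the twenty optional fields shown. It uses Python's sqlite3 with an in-memory database as a stand-in for MySQL so that it runs as-is; columns without NOT NULL are nullable by default in both.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table and column names are
# hypothetical. Only username is required, the rest may be NULL.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name
conn.execute("""
    CREATE TABLE profile (
        profileID INTEGER PRIMARY KEY AUTOINCREMENT,
        username  VARCHAR(50) NOT NULL,
        city      VARCHAR(100),
        website   VARCHAR(255),
        birthday  DATE
    )
""")

# Optional fields that are not supplied are simply left as NULL.
conn.execute(
    "INSERT INTO profile (username, city) VALUES (?, ?)", ("alice", "Leeds")
)
conn.execute("INSERT INTO profile (username) VALUES (?)", ("bob",))

# The whole profile comes back from one row, with no JOIN at all.
alice = dict(
    conn.execute("SELECT * FROM profile WHERE username = ?", ("alice",)).fetchone()
)
bob = dict(
    conn.execute("SELECT * FROM profile WHERE username = ?", ("bob",)).fetchone()
)
```

Unlike the name/value layout, each column keeps its own declared type, at the cost of an ALTER TABLE whenever a new optional field is added.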

If each optional field takes one of a fixed set of values, consider using an ENUM (for roughly 2 to 20 options); however, this method has its own pitfalls.

If your primary concern is database normalisation, you should have a separate look-up table for each option field, even if you have 20 option fields, so that you don't store duplicate data.

Also, if you decide to change the options later, this makes your table a lot easier to maintain.

The JOIN statements aren't that bad, and MySQL supports up to 61 tables in a single join. I already explored that topic in this question of mine.
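As a rough sketch of the look-up table approach for a single option field: the country and profile tables below are hypothetical, and Python's sqlite3 with an in-memory database stands in for MySQL so the example is runnable on its own. The foreign key is declared as a table-level FOREIGN KEY clause, which is the form MySQL (InnoDB) enforces as well.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE country (
        countryID INTEGER PRIMARY KEY,
        name      VARCHAR(100) NOT NULL UNIQUE
    );
    CREATE TABLE profile (
        profileID INTEGER PRIMARY KEY,
        username  VARCHAR(50) NOT NULL,
        countryID INTEGER,  -- optional field, NULL when not chosen
        FOREIGN KEY (countryID) REFERENCES country (countryID)
    );
    INSERT INTO country (countryID, name)
        VALUES (1, 'United Kingdom'), (2, 'France');
    INSERT INTO profile (profileID, username, countryID)
        VALUES (1, 'alice', 1), (2, 'bob', NULL);
""")

# LEFT JOIN keeps profiles that never filled in the optional field.
rows = conn.execute("""
    SELECT p.username, c.name
    FROM profile AS p
    LEFT JOIN country AS c ON c.countryID = p.countryID
    ORDER BY p.profileID
""").fetchall()

# Renaming an option touches one look-up row, not every profile row.
conn.execute("UPDATE country SET name = 'UK' WHERE countryID = 1")


def try_invalid_insert():
    """Return True if a profile pointing at a non-existent country is accepted."""
    try:
        conn.execute(
            "INSERT INTO profile (profileID, username, countryID) "
            "VALUES (3, 'carol', 99)"
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

With 20 option fields the query would carry 20 such LEFT JOINs, one per look-up table, which is still well under the 61-table limit mentioned above.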


 