
Consolidating tables with one-to-one relationships

I have 3 MySQL tables for a membership system.

  • users : Minimum requirement to be a user, only related to account info (email, password, is_activated, etc.)
  • user_profiles : Personal information provided by the user (name, address, phone...)
  • user_member_profiles : Information strictly managed by admins (registration fee paid, meetings attended, etc.)

These could be condensed into one table, saving me headaches and keeping my code clean - but I feel like it's better to leave them separate as they serve slightly different purposes.

Option 1: Leave it this way and keep doing JOINs and tedious UPDATEs (this piece of data goes to this table, that piece goes to another, etc.). More work for me, but maybe it makes more sense?

Option 2: Combine everything into one table.

I would assume using one table would be faster, no need to join tables. Maybe this depends on the data? Each table has about 12-20 fields, so the combined table would be large.

Each user has no more than 1 profile from each table, but may not even have a profile at all (or may have only 1 total).

To add a little context to this: It is for an ever-evolving CMS written in PHP, and I will need to tweak the tables for each installation. Admins need to manage members in a spreadsheet-like way, so I will be selecting up to 200 users at a time.

What is the correct approach to this from a performance, design, or organization perspective?

Another factor to consider with a wide table (many columns) is the effect on the RDBMS cache. Any good developer knows that you don't do 'select * from table', as it will carry unnecessary data over the network from the RDBMS to the client. But a similar effect can happen between disk and RAM, and it also affects the amount of RAM a table needs in order to be cached.

Most RDBMSes allocate a given volume of memory to cache data, thus reducing physical disk reads and speeding up responses to the user. This is the buffer cache in Oracle or SQL Server (in MySQL's InnoDB, the buffer pool).

If you have a wide table and issue a query in the form 'select col1, col2, col3 from table', a row-oriented RDBMS will load the full rows into RAM, not just col1 through col3. As it does so, it will age out older cached data. If your table is wide, say 50 columns, it of course requires more RAM to cache than the same number of rows from a narrow table. This can have a noticeable impact on RDBMS performance.

With lots of wide tables aging other tables out of cache, it is possible to see the IO stats go through the roof, as commonly used tables are evicted to make room for the wide ones.

This factor should be added to the other advantages of normalised data and taken into consideration at table design time. In effect if you have a potentially wide table with some data that will be regularly accessed and some that will be rarely, consider multiple tables with a 1 to 1 relationship.
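As a rough sketch of that hot/cold split using the tables from the question: the frequently run listing query reads only the narrow users table, and the wider profile row is fetched on demand. The 0/1 encoding of is_activated and the id value 42 are assumptions made for illustration.

```sql
-- Frequent query: touches only the narrow users table, so only its
-- short rows compete for space in the buffer cache.
SELECT id, email, is_activated
FROM users
WHERE is_activated = 1      -- assumes is_activated is stored as 0/1
ORDER BY id
LIMIT 200;

-- Occasional query: fetch the wider profile row for one user on demand.
-- LEFT JOIN keeps the user even when no profile row exists.
SELECT u.id, u.email, p.name, p.address, p.phone
FROM users u
LEFT JOIN user_profiles p ON p.id = u.id
WHERE u.id = 42;            -- 42 is a placeholder id
```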

My design urges say keep separate because maybe in the future a user will have two profiles, but performance is likely better if they are merged. If there is truly a one-to-one relationship, and that relationship will never change, then I would merge them.

You don't have to use that many joins for retrieving data.

You can have a VIEW to show, for example, all columns from users and user_profiles:

CREATE VIEW users2 AS
( SELECT u.id
       , u.email
       , u.password
       , u.is_activated
       , p.name
       , p.address
       , p.phone
  FROM users u
    LEFT JOIN user_profiles p
      ON u.id = p.id
)

and use this VIEW in the queries that need data from both tables. Another VIEW for all 3 tables, etc.
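A view over all three tables might look like the sketch below. The view name users_full and the columns registration_fee_paid and meetings_attended are hypothetical (the question only describes that data in words), and it assumes user_member_profiles shares the users primary key in the same way user_profiles does.

```sql
-- Hypothetical view joining all three tables on the shared key.
-- LEFT JOINs keep users that have no profile rows (those columns come back NULL).
CREATE VIEW users_full AS
SELECT u.id
     , u.email
     , u.is_activated
     , p.name
     , p.address
     , p.phone
     , m.registration_fee_paid   -- assumed column name
     , m.meetings_attended       -- assumed column name
FROM users u
  LEFT JOIN user_profiles p
    ON p.id = u.id
  LEFT JOIN user_member_profiles m
    ON m.id = u.id;

-- Spreadsheet-style listing of up to 200 members:
SELECT * FROM users_full ORDER BY id LIMIT 200;
```

Note that a view built on outer joins is generally not updatable in MySQL, so writes would most likely still go to the base tables.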

The design issue is whether you need to have multiple records in any of those tables for one user. If so, do not combine them. If the tables are in a one-to-one relationship, you can combine them, but you should not if they have many fields: the record becomes too wide, which can cause performance problems and can even make it impossible to add data once you exceed the row size limit for a single record.

If you currently have a lot of code that accesses them as separate tables, and a lot of data, restructuring them for the minor gain you would get (saving all of a minute or so in development and probably no time at all in performance for the users) seems to be a bad idea. You could write views so you don't have to do the joins, but honestly these are so simple, I wouldn't bother there either.

There are two reasons to keep tables apart, both to do with how many records you keep about each user.

  • if each person has multiple profiles, keep user and profile data apart; use a column in the profile table (the many side of the relationship) to refer to the primary key of the user table.
  • if each person optionally has a profile (i.e. has one or none), use two tables in the same way, but to make joins easier, use the same primary key in both tables. The aim is to avoid tables with lots of mostly empty (NULL) rows. Another way to think of that is that profile inherits from person - and so uses a table of added data with the same key.
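The optional-profile case in the second bullet could be sketched as follows. Column types and the constraint name are assumptions, not taken from the question; the point is only that user_profiles.id is both the primary key and a foreign key to users.id, so each user can have at most one profile row.

```sql
CREATE TABLE users (
  id           INT UNSIGNED NOT NULL AUTO_INCREMENT,
  email        VARCHAR(255) NOT NULL,
  password     VARCHAR(255) NOT NULL,
  is_activated TINYINT(1)   NOT NULL DEFAULT 0,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE user_profiles (
  id      INT UNSIGNED NOT NULL,   -- same value as users.id, not auto-generated
  name    VARCHAR(255) NULL,
  address VARCHAR(255) NULL,
  phone   VARCHAR(50)  NULL,
  PRIMARY KEY (id),                -- at most one profile per user
  CONSTRAINT fk_user_profiles_user
    FOREIGN KEY (id) REFERENCES users (id) ON DELETE CASCADE
) ENGINE=InnoDB;

-- Create the profile if it is missing, otherwise update it in place.
-- Requires that a users row with id 42 already exists (placeholder id).
INSERT INTO user_profiles (id, name, phone)
VALUES (42, 'Example Name', '555-0100')
ON DUPLICATE KEY UPDATE
  name  = VALUES(name),
  phone = VALUES(phone);
```

The upsert at the end is one way to avoid checking whether a profile row exists before every write; recent MySQL 8.0 releases deprecate VALUES() in this position in favour of a row alias, though the form above still works there.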

Bar those kinds of situations, you want to keep everything in one table, with one key. To express the multiple uses of the data, a good solution is to use views - select a subset of the data and keep it as a view, with a sensible name. When you want, say, administrative data, call up the corresponding view.
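With a single combined table, the views described above could look like this sketch. The table name members, the view names, and the admin-side column names are all hypothetical.

```sql
-- Personal data the member maintains.
CREATE VIEW member_contact AS
SELECT id, email, name, address, phone
FROM members;

-- Administrative data managed by admins (column names assumed).
CREATE VIEW member_admin AS
SELECT id, email, registration_fee_paid, meetings_attended
FROM members;

-- Admin screens then read and write through the narrower view:
SELECT * FROM member_admin ORDER BY id LIMIT 200;

UPDATE member_admin
SET meetings_attended = meetings_attended + 1
WHERE id = 42;   -- placeholder id
```

Simple single-table views like these are generally updatable in MySQL, which is why the UPDATE can target the view directly.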

Unless you're having strange performance issues, you should just have one table.

By performance issues I'm talking about having so much data that you want to partition it across tables to keep it separate (physical disks, servers, whatever). This clearly isn't the case here. If it were the case then there are lots of better ways to deal with that kind of thing.

The sort of performance issues everyone wishes they had and not many people do...
