简体   繁体   中英

How to properly organize related tables in MySQL database?

There are two tables - users and orders:

id first_name orders_amount_total
1 Jone 5634200
2 Mike 3982830
id user_id order_amount
1 1 200
2 1 150
3 2 70
4 1 320
5 2 20
6 2 10
7 2 85
8 1 25

The tables are linked by user id. The task is to show for each user the sum of all his orders, there can be thousands of them (orders), maybe tens of thousands, while there can be hundreds and thousands of users simultaneously making a request. There are two options:

  1. With each new order, in addition to writing to the orders table, increase the orders_amount_total counter, and then simply show it to the user.
  2. Remove the orders_amount_total field, and to show the sum of all orders using tables JOIN and use the SUM operator to calculate the sum of all orders of a particular user.

Which option is better to use? Why? Why is the other option bad?

PS I believe that the second option is concise and correct, given that the database is relational, but there are strong doubts about the load on the server, because the sample when calculating the amount is large even for one user, and there are many of them.

Option 2. is the correct one for the vast majority of cases.

Option 1. would cause data redundancy that may lead to inconsistencies. With option 2. you're on the safe side to always get the right values.

Yes, denormalizing tables can improve performance. But that's a last resort and great care needs to be taken. "tens of thousands" of rows isn't a particular large set for an RDMBS. They are built to handle even millions and more pretty well. So you seem to be far away from the last resort and should go with option 1. and proper indexes.

I agree with @sticky_bit that Option 2. is better than 1. There's another possibility:

Create a VIEW that's a pre-defined invocation of the JOIN / SUM query. A smart DBMS should be able to infer that each time the orders table is updated, it also needs to adjust orders_amount_total for the user_id .

BTW re your schema design: don't name columns id ; don't use the same column name in two different tables except if they mean the same thing.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM