MySQL performance, one table or two
I use PHP and MySQL.
Let's say I have a database table with 10,000 rows. Which of the cases below is best performance-wise?
Case 1: Two tables, products and categories.
SELECT * FROM products INNER JOIN categories ON products.category_id = categories.id
Products
id
name
category_id
Categories
id
name
Case 2: One table, products, containing all the data.
SELECT * FROM products
Products
id
name
category_name
From my perspective, Case 1 is the "correct" way of doing it, but I will save some development time by using Case 2. Maybe it performs better too?
This is too long for a comment. The first is the correct (i.e. SQLish) way of storing this data. It allows you to do the following:
Performance is not the main consideration. The SQL engine takes care of performance through the use of fancy join algorithms and indexes. It does this so you can structure the data in the most sensible and maintainable way for your application.
That said, which one performs better depends on a number of factors (how long the category names are, how many different names there are, how wide the product record is). Differences in performance between the two scenarios are probably not at all important in getting an application to work optimally.
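For concreteness, the two layouts from the question can be sketched side by side. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, the table name products_flat is a hypothetical name for the Case 2 table (so both layouts can coexist in one database), and the sample rows are made up.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Case 1: two tables
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    -- Case 2: one table (hypothetical name products_flat)
    CREATE TABLE products_flat (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_name TEXT NOT NULL
    );
""")

# Made-up sample data, stored once per layout.
conn.executemany("INSERT INTO categories (id, name) VALUES (?, ?)",
                 [(1, "drinks"), (2, "snacks")])
conn.executemany("INSERT INTO products (id, name, category_id) VALUES (?, ?, ?)",
                 [(1, "cola", 1), (2, "tea", 1), (3, "chips", 2)])
conn.executemany("INSERT INTO products_flat (id, name, category_name) VALUES (?, ?, ?)",
                 [(1, "cola", "drinks"), (2, "tea", "drinks"), (3, "chips", "snacks")])

# Case 1 needs a join to get the category name next to each product.
case1_rows = conn.execute("""
    SELECT products.id, products.name, categories.name
    FROM products
    INNER JOIN categories ON products.category_id = categories.id
    ORDER BY products.id
""").fetchall()

# Case 2 reads the same information from a single table.
case2_rows = conn.execute(
    "SELECT id, name, category_name FROM products_flat ORDER BY id"
).fetchall()

print(case1_rows == case2_rows)
```

Both layouts answer the same query with the same rows; the difference is where the category name is stored, which is what the maintainability argument is about.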
Case 1 is better than Case 2 because with Case 2 you would end up with duplicated data: the same value would appear many times in the category_name field. This is bad for two reasons. First, the unnecessary duplicated data will slow down performance. Second, it is less efficient to maintain: suppose you would like to change a category name, for example from drinks to drink. That would take far more time in Case 2 than in Case 1. So to answer your first question, Case 1 is the way to do it.
And as you can imagine from my answer to question one, Case 1 is faster than Case 2 because Case 2 stores unnecessary data.
As for your last question, as I explained in my answer to question one, one pitfall of Case 2 is that if you would like to change a category name, you end up with far more work than in Case 1. To my knowledge, Case 1 has no pitfalls.
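The renaming pitfall can be sketched as follows. This is an illustration under assumptions, not a benchmark: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, products_flat is a hypothetical name for the Case 2 table, and the 1,000 sample products are made up.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    -- Case 2 layout under a hypothetical name so both can coexist
    CREATE TABLE products_flat (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_name TEXT NOT NULL
    );
""")

# One category, 1,000 made-up products, stored in both layouts.
conn.execute("INSERT INTO categories (id, name) VALUES (1, 'drinks')")
sample = [(i, f"product {i}") for i in range(1, 1001)]
conn.executemany(
    "INSERT INTO products (id, name, category_id) VALUES (?, ?, 1)", sample)
conn.executemany(
    "INSERT INTO products_flat (id, name, category_name) VALUES (?, ?, 'drinks')",
    sample)

# Case 1: the name lives in one place, so one row is rewritten.
case1_changed = conn.execute(
    "UPDATE categories SET name = 'drink' WHERE name = 'drinks'").rowcount

# Case 2: every product row carrying the old name is rewritten.
case2_changed = conn.execute(
    "UPDATE products_flat SET category_name = 'drink' "
    "WHERE category_name = 'drinks'").rowcount

print(case1_changed, case2_changed)
```

With this sample data the rename touches 1 row in the two-table layout and 1,000 rows in the single-table layout.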
I think the question is database design centric.
Now the answers to your questions:
Which case will give the best performance?
Answer - Case 1.
Why?
It follows the basic SQL rule of normalization, which will help you in the long run. If in the future you have more than 10,000 rows, it will be tedious to handle them in a single table with redundant data.
If you index the key columns, join queries will execute faster over a large number of rows.
Two separate tables will help you reduce data redundancy.
Why not case 2?
The single table violates the normalization rules, as your own example shows.
Will it take long to get 10,000 rows with a structure like this?
With case 1: It will take a bit longer than Case 2, as there will be join queries involved. But this time will be negligible and can be reduced further by indexing.
With case 2: It will take a bit less time than Case 1, but its performance may suffer because of the redundant data as the number of records grows.
Possible pitfalls?
With case 1 - You may end up writing complex join queries for some difficult scenarios.
With case 2 -
Hope this helps you.