MySQL performance, one table or two

I use PHP and MySQL.

Let's say I have a database table with 10,000 rows. Which of the cases below is best performance-wise?

Case 1

Two tables, products and categories.

SELECT * FROM products INNER JOIN categories ON products.category_id = categories.id

Products

id
name
category_id

Categories

id
name

Case 2

One table, products, containing all the data.

SELECT * FROM products

Products

id
name
category_name

Question(s)

  • Which of these cases has the best performance?
  • As a guess, would it take long to get data from 10,000 rows with a structure like this?
  • Any pitfalls with either case?

From my perspective, Case 1 is the "correct" way of doing it, but I would save some development time by using Case 2. Maybe gain performance too?

This is too long for a comment. The first is the correct (i.e., SQLish) way of storing this data. It allows you to do the following:

  • Validate the category names as they are inserted and updated, using standard foreign key relationships.
  • Change a category name and have it affect all products.
  • Include other information about a category, such as short names, long descriptions, date added, and so on.
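
A minimal sketch of the first two points, assuming Python's built-in sqlite3 module as a stand-in for MySQL so it runs anywhere. The table and column names come from the question; the sample rows and the `PRAGMA` line (SQLite-specific, it switches foreign key enforcement on) are assumptions for illustration only.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only: enforce foreign keys

conn.executescript("""
CREATE TABLE categories (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE products (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES categories(id)
);
INSERT INTO categories (id, name) VALUES (1, 'drinks');
INSERT INTO products (id, name, category_id) VALUES (1, 'cola', 1), (2, 'tea', 1);
""")

# Point 1: a product that references a missing category is rejected.
try:
    conn.execute("INSERT INTO products (name, category_id) VALUES ('mystery', 99)")
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print("insert rejected:", exc)

# Point 2: renaming the category once is visible for every product via the join.
conn.execute("UPDATE categories SET name = 'beverages' WHERE id = 1")
rows = conn.execute("""
    SELECT products.name, categories.name
    FROM products
    INNER JOIN categories ON products.category_id = categories.id
    ORDER BY products.id
""").fetchall()
print(rows)
```

In MySQL the same foreign key check needs a storage engine that enforces it, such as InnoDB.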

Performance is not the main consideration. The SQL engine takes care of performance through the use of fancy join algorithms and indexes. It does this so you can structure the data in the most sensible and maintainable way for your application.

That said, which performs better depends on a number of factors (how long the category names are, how many different names there are, how wide the product record is). Differences in performance between the two scenarios are probably not at all important in getting an application to work optimally.

Case 1 is better than Case 2 because implementing Case 2 leaves you with duplicated data: the same value is stored many times in the "category_name" field. This is bad for two reasons. First, the unnecessary duplicate data can slow things down. Second, it makes changes less efficient: renaming a category, say from drinks to drink, takes far more work in Case 2 than in Case 1. So to answer your first question, Case 1 is the way to do it.

As you can guess from my answer to question one, I would expect Case 1 to be faster than Case 2, because Case 2 carries unnecessary data.

As for your last question: as I explained in my answer to question one, one pitfall of Case 2 is that changing a category name takes far more work than in Case 1. To my knowledge, Case 1 has no pitfalls.
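
The renaming pitfall can be made concrete with a small sketch. It assumes Python's built-in sqlite3 module in place of MySQL, and the table names `products_normalized` and `products_flat` are hypothetical, chosen only so both cases fit in one database. The 1,000 sample products are made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Case 1: products point at a category row (hypothetical table name).
CREATE TABLE products_normalized (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES categories(id)
);
-- Case 2: every product repeats the category name (hypothetical table name).
CREATE TABLE products_flat (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    category_name TEXT NOT NULL
);
""")

conn.execute("INSERT INTO categories (id, name) VALUES (1, 'drinks')")
items = [(f"product {i}",) for i in range(1000)]
conn.executemany(
    "INSERT INTO products_normalized (name, category_id) VALUES (?, 1)", items
)
conn.executemany(
    "INSERT INTO products_flat (name, category_name) VALUES (?, 'drinks')", items
)

# Rename 'drinks' to 'drink' in each layout and count the rows that change.
rows_touched_case1 = conn.execute(
    "UPDATE categories SET name = 'drink' WHERE name = 'drinks'"
).rowcount
rows_touched_case2 = conn.execute(
    "UPDATE products_flat SET category_name = 'drink' WHERE category_name = 'drinks'"
).rowcount

print(rows_touched_case1, rows_touched_case2)  # prints: 1 1000
```

With the separate categories table the rename touches one row; in the single table it touches every product in that category.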

I think the question is database-design centric.

Now, to answer your questions:

  1. Which case will give the best performance?

    Answer - Case 1.

    Why?

    • It follows the basic SQL rule of normalization, which will help you in the long run. If you later have more than 10,000 rows, handling them in a single table with redundant data will be tedious.
    • If you index the key columns, join queries will execute faster over a large number of rows.
    • Two separate tables will help you reduce data redundancy.

    Why not Case 2?

    The single table violates the normalization rules, as your example shows.

  2. Will it take long to get 10,000 rows with a structure like this?

    With Case 1: it will take slightly longer than Case 2 because a join is involved, but the difference will be negligible and can be reduced further with indexing.

    With Case 2: it will take slightly less time than Case 1, but its performance may suffer from the redundant data as the number of records grows.

  3. Possible pitfalls?

    With Case 1 -

    • You may end up writing complex join queries for some difficult scenarios.

    With Case 2 -

    • Data redundancy / duplication
    • Low performance in the long run
    • Poor readability
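
If you want to check the timing and indexing points yourself, something like the following sketch may help. It assumes Python's built-in sqlite3 module instead of MySQL, so the absolute numbers will not match a MySQL server; the table name `products_flat`, the index name `idx_products_category`, and the 20 sample categories are all hypothetical.

```python
import sqlite3
import time

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE products (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES categories(id)
);
-- Case 2 layout, under a hypothetical name so both cases can coexist.
CREATE TABLE products_flat (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    category_name TEXT NOT NULL
);
""")

# 20 made-up categories and 10,000 made-up products, loaded into both layouts.
conn.executemany(
    "INSERT INTO categories (id, name) VALUES (?, ?)",
    [(i, f"category {i}") for i in range(1, 21)],
)
for i in range(10_000):
    cat_id = i % 20 + 1
    conn.execute(
        "INSERT INTO products (name, category_id) VALUES (?, ?)",
        (f"product {i}", cat_id),
    )
    conn.execute(
        "INSERT INTO products_flat (name, category_name) VALUES (?, ?)",
        (f"product {i}", f"category {cat_id}"),
    )

# Index the key column used for joins and lookups.
conn.execute("CREATE INDEX idx_products_category ON products (category_id)")


def timed(sql):
    """Run a query, fetch every row, and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, time.perf_counter() - start


joined, t_join = timed(
    "SELECT * FROM products "
    "INNER JOIN categories ON products.category_id = categories.id"
)
flat, t_flat = timed("SELECT * FROM products_flat")
print(f"Case 1 (join):         {len(joined)} rows in {t_join:.4f}s")
print(f"Case 2 (single table): {len(flat)} rows in {t_flat:.4f}s")

# Ask the engine how it would look up products by category.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM products WHERE category_id = 3"
).fetchall()
for row in plan:
    print(row[-1])  # last column is the human-readable plan step
```

In MySQL the equivalent check would be `EXPLAIN` in front of the query; the timings here only give a feel for the order of magnitude at 10,000 rows.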

Hope this helps.
