MySQL: Using COUNT(column_name) in the column list, and again in the HAVING clause. Does this cause the COUNT(column_name) operation to run twice?
I'm curious about the performance of using COUNT(column_name) twice in a single query. Here is the query in question:
SELECT
employee_name,
COUNT(employee_name)
FROM
employee
GROUP BY
employee_name
HAVING
COUNT(employee_name) > 1;
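Will COUNT(employee_name) be executed twice? Also, how can I check for myself what is happening behind the scenes when I run into similar questions in the future?
Thanks!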
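You can use the optimizer trace to learn more about how the optimizer executes a query and why. In this particular case the trace does not explicitly say how many times the count is computed, but it does give information about the temporary table used to perform the aggregation: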
mysql> SET optimizer_trace='enabled=on';
Query OK, 0 rows affected (0,00 sec)
mysql> SELECT c2, COUNT(c2) FROM temp GROUP BY c2 HAVING COUNT(c2) > 1;
+------+-----------+
| c2 | COUNT(c2) |
+------+-----------+
| 1 | 2 |
| 2 | 2 |
+------+-----------+
2 rows in set (0,00 sec)
mysql> SELECT trace->'$.steps[*].join_execution.steps[*].creating_tmp_table'
-> FROM information_schema.optimizer_trace;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| trace->'$.steps[*].join_execution.steps[*].creating_tmp_table' |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [{"tmp_table_info": {"table": "intermediate_tmp_table", "location": "memory (heap)", "key_length": 5, "row_length": 23, "unique_constraint": false, "row_limit_estimate": 729444}}] |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,01 sec)
mysql> SELECT c2, COUNT(c2) AS c FROM temp GROUP BY c2 HAVING c > 1;
+------+---+
| c2 | c |
+------+---+
| 1 | 2 |
| 2 | 2 |
+------+---+
2 rows in set (0,00 sec)
mysql> SELECT trace->'$.steps[*].join_execution.steps[*].creating_tmp_table'
-> FROM information_schema.optimizer_trace;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| trace->'$.steps[*].join_execution.steps[*].creating_tmp_table' |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [{"tmp_table_info": {"table": "intermediate_tmp_table", "location": "memory (heap)", "key_length": 5, "row_length": 14, "unique_constraint": false, "row_limit_estimate": 1198372}}] |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,00 sec)
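From the above, we see that when the alias is used instead of repeating the COUNT expression, the row size of the temporary table is smaller (14 vs. 23 bytes). This indicates that for your query, the count is computed twice during aggregation.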
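To repeat this kind of check on your own queries, a session along the following lines should work. It is only a sketch: it reuses the employee table and query from the question, and the buffer size shown is an arbitrary illustrative value, not a recommendation.

```sql
-- Enable tracing for the current session only.
SET optimizer_trace = 'enabled=on';

-- Optional: enlarge the trace buffer if traces come back truncated.
-- (1048576 bytes is an illustrative value chosen for this sketch.)
SET optimizer_trace_max_mem_size = 1048576;

-- Run the statement you want to inspect (the query from the question).
SELECT employee_name, COUNT(employee_name)
FROM employee
GROUP BY employee_name
HAVING COUNT(employee_name) > 1;

-- Read the trace recorded for that statement.
-- A non-zero missing_bytes_beyond_max_mem_size means the trace was cut short.
SELECT query, trace, missing_bytes_beyond_max_mem_size
FROM information_schema.optimizer_trace;

-- Turn tracing off again when finished.
SET optimizer_trace = 'enabled=off';
```

Running the same steps once with the repeated COUNT expression and once with an alias in HAVING lets you compare the two tmp_table_info entries side by side, as done above.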
Pick any convenient table and do the following:
mysql> SELECT RAND() AS r FROM canada HAVING r < 0.1 limit 11;
+-----------------------+
| r |
+-----------------------+
| 0.6982369559800596 |
| 0.33121224616767114 |
| 0.3811396559524719 |
| 0.4718028721136999 |
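See also:
Is there a performance-related difference between using an aggregate function and its alias in the ORDER BY clause?
I think there are other discussions involving non-RAND cases.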
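The original question uses COUNT(employee_name), which yields the same value in both cases, so you can't really tell whether it was "evaluated" twice. With RAND(), it is obvious that it was re-evaluated: the output above contains values well above 0.1 even though the query filters on HAVING r < 0.1, so the value displayed is not the value that was tested.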
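If you don't have a table handy, the experiment can be reproduced with a throwaway one. This is a minimal sketch: rand_demo is a hypothetical scratch table invented for this example, and the result may differ between MySQL versions, so treat what you see as an observation about your own server rather than a guarantee.

```sql
-- Hypothetical scratch table; any table with a handful of rows would do.
CREATE TEMPORARY TABLE rand_demo (id INT PRIMARY KEY);
INSERT INTO rand_demo (id) VALUES
  (1), (2), (3), (4), (5), (6), (7), (8), (9), (10),
  (11), (12), (13), (14), (15), (16), (17), (18), (19), (20);

-- The filter asks for r < 0.1. If any returned r is 0.1 or greater,
-- the displayed value cannot be the one the HAVING clause tested,
-- which means RAND() was evaluated more than once per row.
SELECT RAND() AS r FROM rand_demo HAVING r < 0.1;

-- Clean up.
DROP TEMPORARY TABLE rand_demo;
```

Because only about one row in ten passes the filter, you may need to run the SELECT a few times before any rows come back.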