
Is it possible to make this query faster?

Salutations,

I am quite new to MySQL, especially to writing queries, and I was wondering whether it is possible to make my query execute faster. I am using the employees db available here: https://github.com/datacharmer/test_db

The query I had to produce needed to answer the following: "For each department, list number of employees born in each decade and their average salaries"

This is what I came up with:

SELECT DISTINCT d.dept_name, count(e.emp_no), AVG(s.salary), ROUND(YEAR(e.birth_date), -1) AS birth_date 
FROM employees e, departments d, salaries s, dept_emp de 
WHERE de.emp_no = e.emp_no AND de.dept_no = d.dept_no 
    AND e.emp_no = s.emp_no 
    GROUP BY d.dept_name, 
    ROUND(YEAR(e.birth_date), -1);

It works and produces the result the professor wanted, but it is quite slow, taking about 11 seconds to execute. Is there something in my query that makes it slow?

Edit:

Tables described:

mysql> explain dept_emp_latest_date;
+-----------+---------+------+-----+---------+-------+
| Field     | Type    | Null | Key | Default | Extra |
+-----------+---------+------+-----+---------+-------+
| emp_no    | int(11) | NO   |     | NULL    |       |
| from_date | date    | YES  |     | NULL    |       |
| to_date   | date    | YES  |     | NULL    |       |
+-----------+---------+------+-----+---------+-------+
3 rows in set (0.01 sec)

mysql> explain dept_manager
    -> ;
+-----------+---------+------+-----+---------+-------+
| Field     | Type    | Null | Key | Default | Extra |
+-----------+---------+------+-----+---------+-------+
| emp_no    | int(11) | NO   | PRI | NULL    |       |
| dept_no   | char(4) | NO   | PRI | NULL    |       |
| from_date | date    | NO   |     | NULL    |       |
| to_date   | date    | NO   |     | NULL    |       |
+-----------+---------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql> explain employees;
+------------+---------------+------+-----+---------+-------+
| Field      | Type          | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| emp_no     | int(11)       | NO   | PRI | NULL    |       |
| birth_date | date          | NO   |     | NULL    |       |
| first_name | varchar(14)   | NO   |     | NULL    |       |
| last_name  | varchar(16)   | NO   |     | NULL    |       |
| gender     | enum('M','F') | NO   |     | NULL    |       |
| hire_date  | date          | NO   |     | NULL    |       |
+------------+---------------+------+-----+---------+-------+
6 rows in set (0.00 sec)

mysql> explain salaries;
+-----------+---------+------+-----+---------+-------+
| Field     | Type    | Null | Key | Default | Extra |
+-----------+---------+------+-----+---------+-------+
| emp_no    | int(11) | NO   | PRI | NULL    |       |
| salary    | int(11) | NO   |     | NULL    |       |
| from_date | date    | NO   | PRI | NULL    |       |
| to_date   | date    | NO   |     | NULL    |       |
+-----------+---------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql> explain titles;
+-----------+-------------+------+-----+---------+-------+
| Field     | Type        | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| emp_no    | int(11)     | NO   | PRI | NULL    |       |
| title     | varchar(50) | NO   | PRI | NULL    |       |
| from_date | date        | NO   | PRI | NULL    |       |
| to_date   | date        | YES  |     | NULL    |       |
+-----------+-------------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql> explain departments;
+-----------+-------------+------+-----+---------+-------+
| Field     | Type        | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| dept_no   | char(4)     | NO   | PRI | NULL    |       |
| dept_name | varchar(40) | NO   | UNI | NULL    |       |
+-----------+-------------+------+-----+---------+-------+
2 rows in set (0.01 sec)

mysql> explain current_dept_emp;
+-----------+---------+------+-----+---------+-------+
| Field     | Type    | Null | Key | Default | Extra |
+-----------+---------+------+-----+---------+-------+
| emp_no    | int(11) | NO   |     | NULL    |       |
| dept_no   | char(4) | NO   |     | NULL    |       |
| from_date | date    | YES  |     | NULL    |       |
| to_date   | date    | YES  |     | NULL    |       |
+-----------+---------+------+-----+---------+-------+
4 rows in set (0.02 sec)

mysql> explain dept_emp;
+-----------+---------+------+-----+---------+-------+
| Field     | Type    | Null | Key | Default | Extra |
+-----------+---------+------+-----+---------+-------+
| emp_no    | int(11) | NO   | PRI | NULL    |       |
| dept_no   | char(4) | NO   | PRI | NULL    |       |
| from_date | date    | NO   |     | NULL    |       |
| to_date   | date    | NO   |     | NULL    |       |
+-----------+---------+------+-----+---------+-------+
4 rows in set (0.00 sec)

Here's your query refactored to use 21st-century JOIN syntax.

SELECT DISTINCT d.dept_name, count(e.emp_no), AVG(s.salary),
       ROUND(YEAR(e.birth_date), -1) AS birth_date 
  FROM employees e
  JOIN salaries s ON e.emp_no = s.emp_no
  JOIN dept_emp de  ON de.emp_no = e.emp_no 
  JOIN departments d ON de.dept_no = d.dept_no
 GROUP BY d.dept_name, ROUND(YEAR(e.birth_date), -1);

Notice that DISTINCT is redundant in an aggregate (GROUP BY) query. Getting rid of it shaves a couple of seconds.
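To see where the time goes, one option is to put EXPLAIN in front of the SELECT. Note that this is a different use of the keyword from `explain tablename` in the question, which just describes a table. The sketch below reuses the refactored query with DISTINCT removed; what the plan looks like depends on your MySQL version and settings, so treat it as a starting point rather than a diagnosis.

```sql
-- EXPLAIN before a SELECT reports the execution plan instead of the result set.
-- In the output, compare the "rows" estimate per table and check "Extra"
-- for notes such as "Using temporary" or "Using filesort".
EXPLAIN
SELECT d.dept_name, COUNT(e.emp_no), AVG(s.salary),
       ROUND(YEAR(e.birth_date), -1) AS birth_date
  FROM employees e
  JOIN salaries s    ON e.emp_no = s.emp_no
  JOIN dept_emp de   ON de.emp_no = e.emp_no
  JOIN departments d ON de.dept_no = d.dept_no
 GROUP BY d.dept_name, ROUND(YEAR(e.birth_date), -1);
```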

But notice that the salaries table contains historical salary data. Each row contains a from_date and a to_date. The from_date column is part of the primary key of that table, along with the employee number. So your query averages a whole bunch of salary data indiscriminately, and pulls in far too many records.
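A quick way to check how much history is involved is to compare the total number of salary rows with the number of distinct employees. This is only a sketch against the same schema; the column aliases are my own.

```sql
-- How many salary rows does each employee have on average?
-- Without a date filter, every one of these rows is joined and aggregated.
SELECT COUNT(*)                          AS salary_rows,
       COUNT(DISTINCT emp_no)            AS employees_with_salary,
       COUNT(*) / COUNT(DISTINCT emp_no) AS rows_per_employee
  FROM salaries;
```

If rows_per_employee comes out well above 1, then count(e.emp_no) in the unfiltered query is counting salary rows rather than people, so the employee counts are inflated as well as slow to compute.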

The following query takes 4.6 seconds or so (my machine is about the same speed as yours, taking 11 seconds for your first query). It also makes more sense with the data you're given, because it extracts the salary records and department affiliation records for a particular point in time, rather than processing the whole lot.

SELECT d.dept_name, COUNT(e.emp_no), AVG(s.salary),
       ROUND(YEAR(e.birth_date), -1) AS birth_date
  FROM employees e
  JOIN salaries s ON e.emp_no = s.emp_no
  JOIN dept_emp de ON de.emp_no = e.emp_no
  JOIN departments d ON de.dept_no = d.dept_no
 WHERE s.from_date<='2014-01-01' AND s.to_date >'2014-01-01'
   AND de.from_date<='2014-01-01' AND de.to_date >'2014-01-01'
 GROUP BY d.dept_name, ROUND(YEAR(e.birth_date), -1);

It's working through about a quarter-million employee records, so it's handling roughly 52 records per millisecond. Not bad for a laptop.
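One caveat about the decade expression: ROUND(YEAR(e.birth_date), -1) rounds to the nearest ten, so a birth year of 1955 lands in the 1960 bucket. If "decade" is meant as 1950-1959, 1960-1969 and so on (an assumption about what the assignment wants), a FLOOR-based variant may be closer. The aliases birth_decade, employees and avg_salary are my own; the date filter is the same as in the query above.

```sql
-- Bucket by calendar decade: FLOOR(1955 / 10) * 10 = 1950, while ROUND(1955, -1) = 1960.
SELECT d.dept_name,
       FLOOR(YEAR(e.birth_date) / 10) * 10 AS birth_decade,
       COUNT(*)      AS employees,
       AVG(s.salary) AS avg_salary
  FROM employees e
  JOIN salaries s    ON s.emp_no = e.emp_no
  JOIN dept_emp de   ON de.emp_no = e.emp_no
  JOIN departments d ON d.dept_no = de.dept_no
 WHERE s.from_date  <= '2014-01-01' AND s.to_date  > '2014-01-01'
   AND de.from_date <= '2014-01-01' AND de.to_date > '2014-01-01'
 GROUP BY d.dept_name, birth_decade
 ORDER BY d.dept_name, birth_decade;
```

Grouping by the column alias is a MySQL convenience; on other databases you would repeat the expression in GROUP BY.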
