
SQL query is slow. How can I make it faster?

I have the following tables, which together form a system that tells me who was in which room:

CREATE TABLE customers
    (`cus_id` int PRIMARY KEY, `name` varchar(5), `driver_id` int)
;
    
INSERT INTO customers
    (`cus_id`, `name`, `driver_id`)
VALUES
    (1, 'bob', 11111),
    (2, 'james', 22222),
    (3, 'sam', 33333),
    (4, 'billy', 44444)
;


CREATE TABLE hotel_rooms
    (`hroom_id` int PRIMARY KEY, `name` varchar(10), `cus_id` int)
;
    
INSERT INTO hotel_rooms
    (`hroom_id`, `name`, `cus_id`)
VALUES
    (1, 'small room', 3),
    (2, 'big room', 1)
;


CREATE TABLE snapshots
    (`snapshot_id` int PRIMARY KEY, `hroom_id` int, FOREIGN KEY (hroom_id) REFERENCES hotel_rooms (hroom_id), `date_added` datetime)
;
    
INSERT INTO snapshots
    (`snapshot_id`, `hroom_id`, `date_added`)
VALUES
    (1, 1, '2020-01-12 12:43:13'),
    (2, 1, '2020-01-13 17:23:53'),
    (3, 2, '2020-01-19 07:34:01')
;


CREATE TABLE participants
    (`participant_id` int PRIMARY KEY, `snapshot_id` int, FOREIGN KEY (snapshot_id) REFERENCES snapshots (snapshot_id), `cus_id` int)
;
    
INSERT INTO participants
    (`participant_id`, `snapshot_id`, `cus_id`)
VALUES
    (1, 1, 1),
    (2, 1, 3),
    (3, 2, 1),
    (4, 2, 2),
    (5, 2, 3),
    (6, 3, 1),
    (7, 3, 4)
;

My SQL statement:

SELECT s.snapshot_id, 
       hr.name, 
       c1.driver_id AS owner_driver_id,
       md.max_date AS date_added,
       GROUP_CONCAT(c2.driver_id) AS participants_driver_ids 
FROM snapshots s
JOIN (
  SELECT hr.hroom_id, MAX(date_added) AS max_date
  FROM hotel_rooms hr
  JOIN snapshots s ON s.hroom_id = hr.hroom_id
  JOIN participants p ON p.snapshot_id = s.snapshot_id
  JOIN customers c ON c.cus_id = p.cus_id
  WHERE c.cus_id = 1
  GROUP BY hr.hroom_id, hr.name
) md ON md.hroom_id = s.hroom_id AND md.max_date = s.date_added
JOIN hotel_rooms hr ON hr.hroom_id = s.hroom_id
JOIN customers c1 ON c1.cus_id = hr.cus_id
JOIN participants p ON p.snapshot_id = s.snapshot_id
JOIN customers c2 ON c2.cus_id = p.cus_id
GROUP BY s.snapshot_id, hr.name, c1.driver_id, md.max_date
;

SQL code of tables and statement: http://www.sqlfiddle.com/#!9/6844de/1

Desired output: [image]

Essentially, what the participants table says is:

  • snapshot_id=1, bob and sam were in small room.
  • snapshot_id=2, bob, james and sam were in small room.
  • snapshot_id=3, bob and billy were in big room.

Execution plan: [image]

However, the query is slow. I don't understand what I need to index to make it faster, because the query consists almost entirely of joins.

Use DATETIME, not VARCHAR, for datetime values.

Please provide EXPLAIN SELECT...

Run the subquery on its own; that will show whether the slowness is in the subquery or in the rest of the query.

snapshots needs a composite index, INDEX(hroom_id, date_added), with the columns in that order.
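As a rough sketch, the suggestions above could be tried as follows. The index name idx_hroom_date is made up for illustration; the tables and columns are the ones from the question. The second statement runs the question's derived table on its own under EXPLAIN, to help tell whether it or the outer joins are the slow part.

```sql
-- Hypothetical index name; column order follows the recommendation above.
ALTER TABLE snapshots ADD INDEX idx_hroom_date (hroom_id, date_added);

-- The derived table from the question, isolated so its plan can be inspected.
EXPLAIN
SELECT hr.hroom_id, MAX(s.date_added) AS max_date
FROM hotel_rooms hr
JOIN snapshots s ON s.hroom_id = hr.hroom_id
JOIN participants p ON p.snapshot_id = s.snapshot_id
JOIN customers c ON c.cus_id = p.cus_id
WHERE c.cus_id = 1
GROUP BY hr.hroom_id, hr.name;
```

Comparing the timing and plan of this statement with those of the full query should show where most of the cost lies.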

For each hotel room, the query selects the snapshots whose date_added equals that of the latest snapshot in which customer #1 participated.

I'd probably write the subquery as

SELECT hroom_id, MAX(date_added) AS max_date
FROM snapshots s
WHERE snapshot_id IN (SELECT snapshot_id FROM participants WHERE cus_id = 1)
GROUP BY hroom_id

This makes the intention clear, and it drops two tables (hotel_rooms and customers) that are not needed for this evaluation.
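For reference, here is a sketch of the question's full query with that simplified subquery in place of the original derived table. Everything outside the derived table is unchanged from the question. It has not been benchmarked here, so any speed-up is an assumption to verify against real data.

```sql
SELECT s.snapshot_id,
       hr.name,
       c1.driver_id AS owner_driver_id,
       md.max_date AS date_added,
       GROUP_CONCAT(c2.driver_id) AS participants_driver_ids
FROM snapshots s
JOIN (
  -- Latest snapshot per room among the snapshots customer 1 took part in.
  SELECT hroom_id, MAX(date_added) AS max_date
  FROM snapshots
  WHERE snapshot_id IN (SELECT snapshot_id FROM participants WHERE cus_id = 1)
  GROUP BY hroom_id
) md ON md.hroom_id = s.hroom_id AND md.max_date = s.date_added
JOIN hotel_rooms hr ON hr.hroom_id = s.hroom_id
JOIN customers c1 ON c1.cus_id = hr.cus_id
JOIN participants p ON p.snapshot_id = s.snapshot_id
JOIN customers c2 ON c2.cus_id = p.cus_id
GROUP BY s.snapshot_id, hr.name, c1.driver_id, md.max_date;
```

On the sample data from the question this should return one row for snapshot 2 (small room) and one for snapshot 3 (big room), the same snapshots the original query selects.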

Indexes for this subquery:

create index idx1 on participants (cus_id, snapshot_id);
create index idx2 on snapshots (snapshot_id, hroom_id, date_added);

Indexes for the main query:

create index idx3 on snapshots (hroom_id, date_added, snapshot_id);
create index idx4 on hotel_rooms (hroom_id, name);
create index idx5 on customers (cus_id, driver_id);
create index idx6 on participants (snapshot_id, cus_id);
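Once the indexes exist, it is worth checking that the optimizer actually picks them up. The statements below are one possible way to do that in MySQL; which index ends up in the key column of the EXPLAIN output depends on the data volume and the server version, so treat this as a check to run rather than a guaranteed result.

```sql
-- Refresh index statistics so the optimizer sees the new indexes.
ANALYZE TABLE customers, hotel_rooms, snapshots, participants;

-- List the indexes defined on the two tables the subquery reads.
SHOW INDEX FROM snapshots;
SHOW INDEX FROM participants;

-- Inspect the plan of the simplified subquery; look at the `key` column.
EXPLAIN
SELECT hroom_id, MAX(date_added) AS max_date
FROM snapshots
WHERE snapshot_id IN (SELECT snapshot_id FROM participants WHERE cus_id = 1)
GROUP BY hroom_id;
```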
