I have a query that I inherited from a former colleague, and I need to optimize it.
This query returns 72 rows.
SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT contract_no AS c_no, cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where contract_no in (
SELECT distinct(contract_no)
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
)
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;
It takes 20 seconds to execute. I rewrote it to try to optimize it. The new version returns its result in 2 seconds, but it returns 75 rows (3 additional rows).
This is what I did (the only difference is in one subquery):
SELECT id, contract_no, customer, address, cm_mac, aps
FROM
(
SELECT *
from new_installed_devices
where insert4date >='2018-10-28'
AND insert4date <='2018-10-28'
AND install_mark<2
) as d1
left join
(
SELECT *
from
(
SELECT distinct
(contract_no) AS c_no,
cm_mac AS c_mc, MIN(tstamp) as time2,
sum(1) as aps
from devices_change
where tstamp >= '2018-10-28 06:59:59'
AND tstamp <= '2018-10-29 07:00:00'
group by contract_no, cm_mac
) as mtmbl
where mtmbl.time2 >= '2018-10-28 06:59:59'
and mtmbl.time2 <= '2018-10-29 07:00:00'
) as tmp ON d1.contract_no=tmp.c_no
where aps>0
group by contract_no, customer, address, cm_mac;
As you can see, I did not change much, but I am still getting more rows than there should be in the result. Can someone please tell me why my second query does not return exactly the same result? I have tried many things to optimize it, but without success. Thanks a lot!
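A possible explanation for the extra rows, judging only from the two queries as posted: in the first query, MIN(tstamp) is computed over every row of a contract that has any change in the window, and only then is time2 checked against the window, so a (contract_no, cm_mac) pair whose first change predates the window is dropped. In the second query the WHERE clause restricts tstamp to the window before MIN runs, so time2 always falls inside the window and such pairs are kept (aps also counts only the in-window rows). The sketch below reproduces this on made-up data with Python's sqlite3 module rather than MySQL; the sample rows and the names ORIGINAL, REWRITTEN and extra_rows are assumptions for illustration, not the real tables.

```python
import sqlite3

START, END = "2018-10-28 06:59:59", "2018-10-29 07:00:00"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE devices_change (contract_no TEXT, cm_mac TEXT, tstamp TEXT)"
)
# Made-up sample rows (assumption): C2/M2 changed once before the window
# and once inside it; C3/M3 changed only before the window.
conn.executemany(
    "INSERT INTO devices_change VALUES (?, ?, ?)",
    [
        ("C1", "M1", "2018-10-28 10:00:00"),
        ("C2", "M2", "2018-10-20 09:00:00"),
        ("C2", "M2", "2018-10-28 12:00:00"),
        ("C3", "M3", "2018-10-01 08:00:00"),
    ],
)

# Inner part of the first query: MIN(tstamp) over the whole history of
# the contracts that changed in the window, then filter on time2.
ORIGINAL = """
SELECT * FROM (
    SELECT contract_no AS c_no, cm_mac AS c_mc,
           MIN(tstamp) AS time2, SUM(1) AS aps
    FROM devices_change
    WHERE contract_no IN (
        SELECT DISTINCT contract_no FROM devices_change
        WHERE tstamp >= :start AND tstamp <= :end
    )
    GROUP BY contract_no, cm_mac
) AS mtmbl
WHERE mtmbl.time2 >= :start AND mtmbl.time2 <= :end
ORDER BY c_no, c_mc
"""

# Inner part of the second query: rows are limited to the window first,
# so MIN(tstamp) can never fall outside it.
REWRITTEN = """
SELECT * FROM (
    SELECT contract_no AS c_no, cm_mac AS c_mc,
           MIN(tstamp) AS time2, SUM(1) AS aps
    FROM devices_change
    WHERE tstamp >= :start AND tstamp <= :end
    GROUP BY contract_no, cm_mac
) AS mtmbl
WHERE mtmbl.time2 >= :start AND mtmbl.time2 <= :end
ORDER BY c_no, c_mc
"""

params = {"start": START, "end": END}
original_rows = conn.execute(ORIGINAL, params).fetchall()
rewritten_rows = conn.execute(REWRITTEN, params).fetchall()
extra_rows = [row for row in rewritten_rows if row not in original_rows]

print("original :", original_rows)
print("rewritten:", rewritten_rows)
print("extra    :", extra_rows)
```

If this is what happens on the real data, the additional rows should belong to contracts whose cm_mac already had a change before 2018-10-28 06:59:59.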
Don't use SELECT * when you don't need all the columns. It looks like contract_no is the only column needed from d1, hence from new_installed_devices.
Why test insert4date for equality in that weird way?
INDEX(insert4date, install_mark, dl) (in that order)
Avoid IN ( SELECT ... ). Usually it is better to use EXISTS or LEFT JOIN.
DISTINCT(contract_no), ... -- DISTINCT is not a function; its effect applies to the entire set of expressions. Get rid of DISTINCT since the GROUP BY has that effect.
INDEX(contract_no, cm_mac, tstamp) (in that order)
Please provide SHOW CREATE TABLE.

The first subquery (d1) can be replaced by selecting directly from new_installed_devices, with some conditions in the WHERE clause. In older versions, MySQL doesn't handle subqueries very well, so try to avoid them in the FROM clause (especially if you have more than 1 or 2 of them).
The conditions on mtmbl.time2 can be folded into the subquery's HAVING clause, to make sure you filter that data as quickly as possible, without creating a large temp table with that subquery.
I can only guess the join order MySQL will choose here, so you can try adding these indexes and running the following query to see whether it works better. I applied the recommendations above to the query below (I hope my guesses about the columns' origins were correct; otherwise please fix everything accordingly):
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_no_mac_tstamp` (`contract_no`,`cm_mac`,`tstamp`);
ALTER TABLE `devices_change` ADD INDEX `devices_change_idx_tstamp_no` (`tstamp`,`contract_no`);
ALTER TABLE `new_installed_devices` ADD INDEX `new_installed_device_idx_no_insert4date` (`contract_no`,`insert4date`);
The query:
SELECT
-- columns are qualified with the aliases d1 and tmp, because the table name cannot be used once it is aliased
d1.id,
d1.contract_no,
d1.customer,
d1.address,
d1.cm_mac,
tmp.aps
FROM
new_installed_devices AS d1
LEFT JOIN
(
SELECT
*
FROM
(SELECT
devices_change.contract_no AS c_no,
devices_change.cm_mac AS c_mc,
MIN(devices_change.tstamp) AS time2,
sum(1) AS aps
FROM
devices_change
WHERE
devices_change.contract_no IN (
SELECT
DISTINCT (devices_change.contract_no)
FROM
devices_change
WHERE
devices_change.tstamp >= '2018-10-28 06:59:59'
AND devices_change.tstamp <= '2018-10-29 07:00:00'
)
GROUP BY
devices_change.contract_no,
devices_change.cm_mac
-- time2 is a select alias, not a column of devices_change, so it is referenced unqualified
HAVING
time2 >= '2018-10-28 06:59:59'
AND time2 <= '2018-10-29 07:00:00'
ORDER BY
NULL) AS mtmbl) AS tmp
ON d1.contract_no = tmp.c_no
WHERE
tmp.aps > 0
AND d1.insert4date >= '2018-10-28'
AND d1.insert4date <= '2018-10-28'
AND d1.install_mark < 2
GROUP BY
d1.contract_no,
d1.customer,
d1.address,
d1.cm_mac
ORDER BY
NULL;
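The earlier advice about DISTINCT and IN ( SELECT ... ) can be checked on toy data. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module rather than MySQL, the sample rows are made up, and the helper rows() and the alias names d and w are hypothetical. It shows that DISTINCT (contract_no), cm_mac deduplicates the whole (contract_no, cm_mac) pair, that GROUP BY on the same columns yields the same pairs, and that an EXISTS form returns the same rows as the IN form. It says nothing about which form MySQL executes faster; that depends on the version and the indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE devices_change (contract_no TEXT, cm_mac TEXT, tstamp TEXT)"
)
# Made-up sample rows (assumption): contract C1 has two different cm_mac values.
conn.executemany(
    "INSERT INTO devices_change VALUES (?, ?, ?)",
    [
        ("C1", "M1", "2018-10-28 10:00:00"),
        ("C1", "M1", "2018-10-28 11:00:00"),
        ("C1", "M2", "2018-10-28 12:00:00"),
        ("C2", "M3", "2018-10-20 09:00:00"),
    ],
)


def rows(sql):
    """Hypothetical helper: run a query and return all rows as a list."""
    return conn.execute(sql).fetchall()


# The parentheses only group an expression; DISTINCT still applies to the pair.
with_parens = rows(
    "SELECT DISTINCT (contract_no), cm_mac FROM devices_change ORDER BY 1, 2"
)
without_parens = rows(
    "SELECT DISTINCT contract_no, cm_mac FROM devices_change ORDER BY 1, 2"
)
# GROUP BY on the same columns already yields one row per pair.
grouped = rows(
    "SELECT contract_no, cm_mac FROM devices_change "
    "GROUP BY contract_no, cm_mac ORDER BY 1, 2"
)

WINDOW = "tstamp >= '2018-10-28 06:59:59' AND tstamp <= '2018-10-29 07:00:00'"

# IN ( SELECT ... ) form, as used in the thread.
in_rows = rows(
    "SELECT contract_no, cm_mac, tstamp FROM devices_change "
    "WHERE contract_no IN (SELECT contract_no FROM devices_change WHERE "
    + WINDOW
    + ") ORDER BY 1, 2, 3"
)
# Correlated EXISTS form of the same filter.
exists_rows = rows(
    "SELECT d.contract_no, d.cm_mac, d.tstamp FROM devices_change AS d "
    "WHERE EXISTS (SELECT 1 FROM devices_change AS w "
    "WHERE w.contract_no = d.contract_no AND w."
    + WINDOW.replace(" AND tstamp", " AND w.tstamp")
    + ") ORDER BY 1, 2, 3"
)

print(with_parens)
print(in_rows == exists_rows)
```

Note that C1 appears twice in the DISTINCT output (once per cm_mac), which is why DISTINCT (contract_no) does not mean "one row per contract".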