Is there any optimization I can do to speed up this query? It currently takes 30 minutes to run.
SELECT
*
FROM
service s
JOIN
bucket b ON s.procedure = b.hcpc
WHERE
month >= '201904'
AND bucket = 'Respirator'
Explain execution plan -
Gather (cost=1002.24..81397944.91 rows=9782404 width=212)
Workers Planned: 2
-> Hash Join (cost=2.24..80418704.51 rows=4076002 width=212)
Hash Cond: ((s .procedure)::text = (bac.hcpc)::text)
-> Parallel Seq Scan on service s (cost=0.00..77753288.33 rows=699907712 width=154)
Filter: ((month)::text >= '201904'::text)
-> Hash (cost=2.06..2.06 rows=14 width=58)
-> Seq Scan on buckets b (cost=0.00..2.06 rows=14 width=58)
Filter: ((bucket)::text = 'Respirator'::text)
Query optimization doesn't have hard and fast rules; it's more a matter of trial and error. Sometimes a technique works really well on one query and has little to no effect on another. That being said, here are a couple of things I would try to get you started.
Instead of SELECT *, list out the column names that you need. If you need all of both tables, still list them out.
Can you add anything to the WHERE clause to do some preliminary filtering? Comparing only string data types is almost always a pain point in query optimization.
Again, it's all trial and error; these are just a couple of places to start.
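As a rough sketch of the first suggestion, the query could name its columns explicitly. Only the four columns that appear in the question (procedure, month, hcpc, bucket) are known, so anything beyond those is a placeholder to fill in from the real table definitions.

```sql
-- Sketch only: the select list uses just the columns visible in the question.
SELECT
    s.procedure,
    s.month,
    b.bucket
    -- add the other columns you actually need here, by name
FROM service s
JOIN bucket b ON s.procedure = b.hcpc
WHERE s.month >= '201904'
  AND b.bucket = 'Respirator';
```

A narrower select list reduces the row width (212 bytes in the plan above) that each worker has to pass up through the Gather node, although on its own it will not remove the sequential scan on service.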
SELECT *
FROM service s JOIN
bucket b
ON s.procedure = b.hcpc
WHERE s.month >= '201904' AND
b.bucket = 'Respirator';
I would suggest indexes on:
bucket(bucket, hcpc)
service(procedure, month)
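Written out as PostgreSQL statements, the two suggested indexes might look like the following. The index names are made up for illustration, and the table is assumed to be called bucket as in the query (the plan shows it as buckets).

```sql
-- Hypothetical index names; adjust the table name if it is really "buckets".
CREATE INDEX idx_bucket_bucket_hcpc
    ON bucket (bucket, hcpc);

-- service is very large, so this build may take a long time.
-- CREATE INDEX CONCURRENTLY avoids blocking writes while it runs.
CREATE INDEX idx_service_procedure_month
    ON service (procedure, month);

-- Refresh planner statistics so the new indexes are costed correctly.
ANALYZE bucket;
ANALYZE service;

-- Check whether the planner now uses the indexes.
EXPLAIN
SELECT *
FROM service s JOIN
     bucket b
     ON s.procedure = b.hcpc
WHERE s.month >= '201904' AND
      b.bucket = 'Respirator';
```

The idea is that the 14 or so matching bucket rows can drive index lookups into service by procedure and month instead of a full parallel sequential scan. Whether the planner actually picks that plan depends on how selective those procedure codes are, so compare the new EXPLAIN output with the original.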