Optimizing SQL query on large table

Question

Is there any optimization I can do to speed up this query? It currently takes 30 minutes to run.

SELECT
    *
FROM
    service s
JOIN 
    bucket b ON s.procedure = b.hcpc
WHERE 
    month >= '201904'
    AND bucket = 'Respirator'

EXPLAIN execution plan:

Gather  (cost=1002.24..81397944.91 rows=9782404 width=212)
  Workers Planned: 2
  ->  Hash Join  (cost=2.24..80418704.51 rows=4076002 width=212)
        Hash Cond: ((s.procedure)::text = (b.hcpc)::text)
        ->  Parallel Seq Scan on service s  (cost=0.00..77753288.33 rows=699907712 width=154)
              Filter: ((month)::text >= '201904'::text)
        ->  Hash  (cost=2.06..2.06 rows=14 width=58)
              ->  Seq Scan on buckets b  (cost=0.00..2.06 rows=14 width=58)
                    Filter: ((bucket)::text = 'Respirator'::text)

Answer 1

Query optimization doesn't have hard and fast rules; it's largely trial and error. A technique that works really well on one query may have little to no effect on another. That said, here are a few things I would try to get you started.

Instead of SELECT *, list the column names that you need. Even if you need every column from both tables, still list them out.
Are there any numeric columns that you can use in your WHERE clause to do some preliminary filtering? Your plan shows month being compared as text, and filtering only on string columns is often a pain point in query optimization.
Look at the existing indexes on both tables and see if any changes need to be made. Indexes can have a huge impact on query performance, positive or negative depending on how they are set up.

Again, it's all trial and error; these are just a few places to start.
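
As a rough sketch of the first and third suggestions, assuming PostgreSQL (which the plan output suggests): the pg_indexes catalog view shows which indexes already exist, and the second statement names columns explicitly. Only the four columns visible in the question are listed, since the rest of the schema is not shown.

```sql
-- Show the indexes that currently exist on the two tables.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('service', 'bucket');

-- Name the needed columns instead of using SELECT *.
-- These four are the only columns visible in the question;
-- add whichever others the result actually requires.
SELECT s.procedure, s.month, b.hcpc, b.bucket
FROM service s
JOIN bucket b ON s.procedure = b.hcpc
WHERE s.month >= '201904'
  AND b.bucket = 'Respirator';
```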

Answer 2

SELECT *
FROM service s JOIN 
     bucket b
     ON s.procedure = b.hcpc
WHERE s.month >= '201904' AND
      b.bucket = 'Respirator';

I would suggest indexes on:

bucket(bucket, hcpc)
service(procedure, month)
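
A minimal sketch of creating those indexes, assuming PostgreSQL. The index names are made up for illustration, and CONCURRENTLY is an optional choice that is not part of the suggestion above.

```sql
-- Small lookup table: filter on bucket, then read hcpc straight from the index.
CREATE INDEX idx_bucket_bucket_hcpc
    ON bucket (bucket, hcpc);

-- Large table: CONCURRENTLY builds the index without blocking writes.
-- It is slower to build and cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY idx_service_procedure_month
    ON service (procedure, month);

-- Refresh planner statistics so the new indexes are costed properly.
ANALYZE bucket;
ANALYZE service;

-- Re-check the plan. Note that EXPLAIN ANALYZE actually executes the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM service s
JOIN bucket b ON s.procedure = b.hcpc
WHERE s.month >= '201904'
  AND b.bucket = 'Respirator';
```

If the new plan still shows a Parallel Seq Scan on service, the planner has estimated that the index lookups would cost more than scanning the table, and the other suggestions may be worth trying.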

solution1
0 2021-05-04 03:59:21

solution2
0 ACCEPTED 2021-05-04 12:16:21