Cost of SORT is slowing down my query

Question

PostgreSQL 7.4 (yes, we are upgrading).

In my WHERE condition I have this:

AND CASE
    WHEN "substring"(t."FieldID"::text, 0, 3) = '01'::text 
        OR "substring"(t."FieldID"::text, 0, 4) = '123'::text 
        OR "substring"(t."FieldID"::text, 0, 5) = '5555'::text 
        OR "substring"(t."FieldID"::text, 0, 6) = '44444'::text 
        OR "substring"(t."FieldID"::text, 0, 3) = '99'::text 
    THEN 1
    ELSE 0
END = 1

An alternate syntax, but with no change in cost:

AND CASE
    WHEN "substring"(t."FieldID"::text, 0, 3) = '01'::text THEN 1
    WHEN "substring"(t."FieldID"::text, 0, 4) = '123'::text THEN 1
    WHEN "substring"(t."FieldID"::text, 0, 5) = '5555'::text THEN 1
    WHEN "substring"(t."FieldID"::text, 0, 6) = '44444'::text THEN 1
    WHEN "substring"(t."FieldID"::text, 0, 3) = '99'::text THEN 1    
    ELSE 0
END = 1

I'm looking for a cost-effective way to limit the results by the start of a string: if the string starts with 01, 123, 5555, 44444 or 99, the row should be in the result set.

Any thoughts?

Note: FieldID is indexed. I am viewing the EXPLAIN output to find the bottlenecks in the query; adding the code above is what makes the cost of the Sort go way up and slows the return of the results.

Output from EXPLAIN:

Sort (cost=88716.84..88719.89 rows=822 width=64)

There is a lot more, as the query is complex, but if I remove this part of the code the Sort cost goes way down.

Answer 1

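If you are just filtering by the starting characters, you can use LIKE. A pattern anchored at the start, such as '01%', can use a B-tree index on the column, provided the index is usable for pattern matching (the database uses the C locale, or the index was built with a pattern operator class such as text_pattern_ops).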

AND (t."FieldID"::text LIKE '01%' OR 
     t."FieldID"::text LIKE '123%' OR 
     t."FieldID"::text LIKE '5555%' OR
     t."FieldID"::text LIKE '44444%' OR
     t."FieldID"::text LIKE '99%')
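Whether the planner can actually use an index for these anchored patterns depends on how the index was built. A minimal sketch, assuming that t is the real table name (in the question it may only be an alias), that "FieldID" is a text column, and that the database does not use the C locale; the index name is made up for illustration:

```sql
-- Assumption: "FieldID" is of type text. For a varchar column the matching
-- operator class would be varchar_pattern_ops instead.
CREATE INDEX t_fieldid_pattern ON t ("FieldID" text_pattern_ops);

-- Refresh planner statistics, then check whether the plan shows an index
-- scan rather than a sequential scan for the anchored patterns.
ANALYZE t;

EXPLAIN
SELECT *
FROM t
WHERE "FieldID" LIKE '01%'
   OR "FieldID" LIKE '99%';
```

The LIKE filters select the same rows as the original CASE: substring(x, 0, 3) starts counting at position 0, one place before the first character, so it returns the first two characters of x. If "FieldID" is not a string type, the cast to text prevents a plain index on the column from being used, and an index on the cast expression would be needed instead.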

Answer 2

You might get some traction by defining expression indexes that match the query, something like:

CREATE INDEX t_fieldid_prefix_3 ON t (("substring"("FieldID"::text, 0, 3)));
CREATE INDEX t_fieldid_prefix_4 ON t (("substring"("FieldID"::text, 0, 4)));
CREATE INDEX t_fieldid_prefix_5 ON t (("substring"("FieldID"::text, 0, 5)));
CREATE INDEX t_fieldid_prefix_6 ON t (("substring"("FieldID"::text, 0, 6)));
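An expression index is only considered when the query contains the same expression that was indexed, compared directly against a value. A sketch of how the filter could be written as plain OR conditions instead of being wrapped in CASE ... END = 1, assuming t is the real table name and the four prefix indexes exist:

```sql
-- Each branch repeats an indexed expression verbatim, so the planner has a
-- chance to match it to the corresponding expression index.
EXPLAIN
SELECT *
FROM t
WHERE "substring"("FieldID"::text, 0, 3) = '01'
   OR "substring"("FieldID"::text, 0, 4) = '123'
   OR "substring"("FieldID"::text, 0, 5) = '5555'
   OR "substring"("FieldID"::text, 0, 6) = '44444'
   OR "substring"("FieldID"::text, 0, 3) = '99';
```

Running this with EXPLAIN before and after creating the indexes shows whether the planner picks them up on your version.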

If you're always looking for the same prefixes, include the whole thing in the index:

CREATE INDEX t_fieldid_prefix ON t((CASE
    WHEN "substring"("FieldID"::text, 0, 3) = '01'::text 
        OR "substring"("FieldID"::text, 0, 4) = '123'::text 
        OR "substring"("FieldID"::text, 0, 5) = '5555'::text 
        OR "substring"("FieldID"::text, 0, 6) = '44444'::text 
        OR "substring"("FieldID"::text, 0, 3) = '99'::text 
    THEN 1
    ELSE 0
END));

Answer 3

I have no idea whether this is supported by your ancient version, but you could try to create an index on the sort expression to see if that improves the query:

CREATE INDEX idx_case ON the_table (
  (CASE
      WHEN substring("FieldID", 0, 3) = '01' THEN 1
      WHEN substring("FieldID", 0, 4) = '123' THEN 1
      WHEN substring("FieldID", 0, 5) = '5555' THEN 1
      WHEN substring("FieldID", 0, 6) = '44444' THEN 1
      WHEN substring("FieldID", 0, 3) = '99' THEN 1    
      ELSE 0
  END));

With a current version, I'm pretty sure this could be used to improve the ORDER BY step.

Answer 4

Depending on how often this sort of query is run, and on how much data there is, you might consider calculating some of this outside the query and adding extra columns that exist only to be indexed. This is the same way a data warehouse denormalizes to speed up reporting queries.
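A minimal sketch of that idea, assuming t is the real table name; the column name fieldid_prefix_match and the index name are hypothetical. The column is added and backfilled in separate statements, which keeps the example compatible with older releases:

```sql
-- Hypothetical helper column that stores the precomputed prefix test.
ALTER TABLE t ADD COLUMN fieldid_prefix_match boolean;

-- Backfill the flag once for the existing rows.
UPDATE t
SET fieldid_prefix_match =
       ("FieldID"::text LIKE '01%'
     OR "FieldID"::text LIKE '123%'
     OR "FieldID"::text LIKE '5555%'
     OR "FieldID"::text LIKE '44444%'
     OR "FieldID"::text LIKE '99%');

CREATE INDEX t_fieldid_prefix_match ON t (fieldid_prefix_match);

-- The complex query then filters on the stored flag instead of the CASE.
EXPLAIN
SELECT *
FROM t
WHERE fieldid_prefix_match = true;
```

The flag has to be kept current for new and changed rows, for example by a trigger or by the application (not shown here). An index on a two-valued column only pays off when a small share of the rows match.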

4 answers

solution1
2 ACCEPTED 2011-09-13 21:30:40

solution2
1 2011-09-13 21:20:34

solution3
1 2011-09-13 21:21:15

solution4
0 2011-09-13 21:12:33
