
How to fix PostgreSQL generic plan estimate for ANY($1) array parameter?

I've run into an issue where PostgreSQL (13.1 on Windows, default config) starts preferring a generic plan after 5 executions because the generic plan's estimate is clearly wrong. The thing is, the custom plan is up to 10 times faster than the generic plan, and at least 2.5 times faster in the "worst" case. I don't want to set plan_cache_mode = force_custom_plan globally, because the query is part of software that runs in parallel with connection pooling across several threads, alongside dozens of smaller and larger queries, and that would feel like a hacky workaround that might also be dangerous for other queries.

I think I've tracked it down to the generic plan estimate being wrong for the x.id = ANY($1) parts of the WHERE clause, of which I've got three.

Query goes like this:

SELECT... FROM... WHERE... x.id = ANY($1) AND y.id = ANY($2) AND z.id = ANY($3)...

Of course, this is hugely simplified, but I know that 99% of the time I pass an int8[] with 50 entries as the first parameter (the query backs a paged view of records, where one page has 50 records). That is correctly estimated as 50 rows in the custom query plan, but as 10 rows in the generic query plan.
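For what it's worth, the generic estimate can be inspected directly, without executing the statement five times, by forcing generic plans for the session (a sketch using the sbuilding table from the plan excerpts below; p is a throwaway prepared statement):

SET plan_cache_mode = force_generic_plan;
PREPARE p(bigint[]) AS SELECT * FROM sbuilding WHERE id = ANY($1);
-- the generic plan never sees the array value, so the estimate comes out
-- as rows=10 regardless of how many elements are actually passed
EXPLAIN EXECUTE p('{1,2,3,4,5}');
RESET plan_cache_mode;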

Switching to IN ($1, $2, ... $50) fixes this, but we're currently trying to move our IN clauses over to ANY, because it's more efficient over JDBC and we've been bitten by the parameter limit a few times with IN (which doesn't happen with ANY, since the whole array is a single parameter). Also, IN would make the parameter count variable, so the query planner would see a different query quite often (only $1 is pretty much always an array of 50 values on our prod system; the others may have fewer or more entries depending on lots of factors).
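To illustrate the difference in shape (a sketch; sbuilding stands in for the real tables):

-- IN: one bind parameter per element, so the statement text changes with the
-- element count and the driver's parameter limit applies
SELECT * FROM sbuilding WHERE id IN ($1, $2, $3);

-- ANY: a single array parameter; the statement text stays identical no matter
-- how many elements are passed
SELECT * FROM sbuilding WHERE id = ANY($1::bigint[]);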

So far I've tried, without much success:

  • casting things to try to hint the planner
  • creating statistics on various involved tables
  • increasing the statistics target on involved columns
  • using unnest (this fixes the estimate, but the query is then about 20 times slower in every case, so not really a solution for me; see the sketch after this list), plus a few dozen other "hacky" workarounds like a regex split to table and the like
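The unnest variant looked roughly like this (a sketch against the simplified query shape; pick and p are placeholder names):

-- joining against the unnested array fixed the estimate, but the overall
-- query was about 20 times slower
PREPARE pick(bigint[]) AS
SELECT b.*
FROM unnest($1) AS p(id)
JOIN sbuilding b ON b.id = p.id
WHERE b.deletedat IS NULL;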

Note that I do know why it uses a generic plan after the 5th execution: PostgreSQL builds a custom plan for each of the first five executions, and from then on switches to the generic plan whenever its estimated cost is lower than the average estimated cost of the previous custom plans (choose_custom_plan() in plancache.c). But I don't know how to fix that wrong estimate.

Here are the relevant rows of the custom plan:

                                                                            ->  Bitmap Heap Scan on sbuilding b  (cost=150.51..332.59 rows=50 width=29) (actual time=0.077..0.187 rows=50 loops=2)
                                                                                Recheck Cond: (id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50}'::bigint[]))
                                                                                Filter: (deletedat IS NULL)
                                                                                Heap Blocks: exact=50
                                                                                ->  Bitmap Index Scan on sbuilding_pkey  (cost=0.00..150.50 rows=50 width=0) (actual time=0.065..0.066 rows=50 loops=2)
                                                                                      Index Cond: (id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50}'::bigint[]))

And the same part for the generic plan:

                                                                    ->  Index Scan using sbuilding_pkey on sbuilding b  (cost=0.28..82.84 rows=10 width=29) (actual time=0.049..0.229 rows=50 loops=2)
                                                                          Index Cond: (id = ANY ($2))
                                                                          Filter: (deletedat IS NULL)

I've replaced the actual ids with the numbers 1-50, but the real ids all exist, as they are retrieved from the db in a step before this one.

I'd prefer not to post the full query or query plans, as those contain a lot of sensitive information (it would be hundreds of words and values for me to replace), but I hope these excerpts are enough to show the issue, and hopefully one of you will be able to assist.

Edit: I don't have any array extension like intarray installed explicitly - I found a mailing list thread regarding estimates with the intarray extension, so I thought I'd add this, though I'm not sure whether it still applies. Here's the link: https://www.postgresql.org/message-id/20150317191531.GE10492%40momjian.us

Thanks a lot!

Edit2: I just retested with a super simple query, and it seems that the query planner always assumes 10 values for = ANY(:arrayParam) in generic plans. This seems to be the culprit in my case. Is there any way to "fix" that assumption? Preferably without installing any extensions that need config changes, like pg_hint_plan. Thx!
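The retest boiled down to something like this (a sketch; t is a throwaway table):

CREATE TABLE t (id bigint PRIMARY KEY);
INSERT INTO t SELECT generate_series(1, 100000);
ANALYZE t;

PREPARE q(bigint[]) AS SELECT * FROM t WHERE id = ANY($1);

-- run the EXPLAIN at least six times: executions 1-5 get custom plans and
-- estimate rows=50; from the sixth on, the cached generic plan is chosen
-- and estimates rows=10
EXPLAIN EXECUTE q('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50}');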

Edit3: I found the following in the Postgres source code, in selfuncs.c:

/*
 * Arbitrarily assume 10 elements in the eventual array value (see
 * also estimate_array_length).  We don't risk an assumption of
 * disjoint probabilities here.
 */
for (i = 0; i < 10; i++)
{
    if (useOr)
        s1 = s1 + s2 - s1 * s2;
    else
        s1 = s1 * s2;
}

I'll see where I can get from there, but I still hope someone has a solution that does not involve me going through the whole Postgres source code ;)

That can be done by changing plan_cache_mode for this query only:

BEGIN;
SET LOCAL plan_cache_mode = force_custom_plan;
SELECT /* your query */;
COMMIT;

That will make the optimizer always use a custom plan.
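Since SET LOCAL only lasts until the end of the enclosing transaction, the setting reverts at COMMIT or ROLLBACK, so it won't leak into other statements sharing the pooled connection.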
