
Is estimated row count accurate when only inserts are done in a table?

We use PostgreSQL for analytics. Three typical operations we do on tables are:

  • Create table as select
  • Create table followed by insert in table
  • Drop table

We never run UPDATE, DELETE, or similar statements.

In this situation, can we assume that the row-count estimate from the following query is accurate?

SELECT reltuples AS estimate FROM pg_class where relname = 'mytable';

With autovacuum running (which is the default), ANALYZE and VACUUM are launched automatically, and both update reltuples. The basic configuration parameters for ANALYZE (which typically runs more often), quoting the manual:

autovacuum_analyze_threshold (integer)

Specifies the minimum number of inserted, updated or deleted tuples needed to trigger an ANALYZE in any one table. The default is 50 tuples. This parameter can only be set in the postgresql.conf file or on the server command line; but the setting can be overridden for individual tables by changing table storage parameters.

autovacuum_analyze_scale_factor (floating point)

Specifies a fraction of the table size to add to autovacuum_analyze_threshold when deciding whether to trigger an ANALYZE. The default is 0.1 (10% of table size). This parameter can only be set in the postgresql.conf file or on the server command line; but the setting can be overridden for individual tables by changing table storage parameters.
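To make the trigger condition concrete: per the manual, auto-analyze fires once the number of tuples changed since the last ANALYZE exceeds threshold + scale factor × reltuples. The query below is a rough sketch that puts the two numbers side by side for one table. It assumes the table is named mytable (as in the question) and reads only the global settings, so it ignores any per-table storage parameter overrides.

```sql
-- Compare changes since the last ANALYZE with the auto-analyze trigger point.
-- Assumption: global settings only; per-table overrides are not considered.
SELECT s.relname,
       s.n_mod_since_analyze,                 -- rows changed since last ANALYZE
       current_setting('autovacuum_analyze_threshold')::int
     + current_setting('autovacuum_analyze_scale_factor')::float8
     * greatest(c.reltuples, 0)               -- reltuples can be -1 if never analyzed
         AS analyze_trigger_point,
       s.last_autoanalyze
FROM   pg_stat_user_tables s
JOIN   pg_class c ON c.oid = s.relid
WHERE  s.relname = 'mytable';
```

With the defaults, a table of 1,000,000 rows is re-analyzed only after roughly 50 + 0.1 × 1,000,000 = 100,050 newly inserted rows, so reltuples can lag by up to about 10% in between.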

Another quote gives insight to details:

For efficiency reasons, reltuples and relpages are not updated on-the-fly, and so they usually contain somewhat out-of-date values. They are updated by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX. A VACUUM or ANALYZE operation that does not scan the entire table (which is commonly the case) will incrementally update the reltuples count on the basis of the part of the table it did scan, resulting in an approximate value. In any case, the planner will scale the values it finds in pg_class to match the current physical table size, thus obtaining a closer approximation.

So the estimate is only as current as the last ANALYZE or VACUUM run on the table. You can make the autovacuum settings more aggressive, globally or per table.
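As a minimal sketch of the per-table route, the statements below override the two parameters quoted above for a single table. The table name mytable is taken from the question and the numbers are purely illustrative, not recommendations. Since autovacuum may not have visited a table yet right after CREATE TABLE AS or a large INSERT, running ANALYZE manually at that point is another option.

```sql
-- Per-table override (illustrative values): auto-analyze after about
-- 1000 + 1% changed rows instead of the default 50 + 10%.
ALTER TABLE mytable SET (
    autovacuum_analyze_threshold    = 1000,
    autovacuum_analyze_scale_factor = 0.01
);

-- Alternatively, refresh the statistics (and reltuples) right after a bulk load.
ANALYZE mytable;
```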

On top of that, you can scale the estimate to the current physical table size, like the Postgres planner itself does.
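One way to sketch that scaling: take the tuple density recorded at the last VACUUM or ANALYZE (reltuples / relpages) and multiply it by the current number of pages on disk. This approximates what the planner does rather than reproducing it exactly, and it assumes the table is called mytable and is visible on the current search_path.

```sql
-- Scale the stored row estimate to the table's current physical size.
SELECT (CASE WHEN c.reltuples < 0 THEN NULL        -- never vacuumed or analyzed
             WHEN c.relpages  = 0 THEN float8 '0'  -- empty as of the last scan
             ELSE c.reltuples / c.relpages         -- tuples per page
        END
      * (pg_relation_size(c.oid)                   -- current size in bytes
       / current_setting('block_size')::int)       -- converted to pages
       )::bigint AS scaled_estimate
FROM   pg_class c
WHERE  c.oid = 'mytable'::regclass;
```

For an insert-only table with fairly uniform row sizes, this tends to track the real count more closely than raw reltuples between ANALYZE runs, because new rows show up as new pages immediately.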

Note that before Postgres 13, autovacuum did not trigger VACUUM (of secondary relevance to your case) for insert-only activity. Quoting the release notes:

  • Allow inserts, not only updates and deletes, to trigger vacuuming activity in autovacuum (Laurenz Albe, Darafei Praliaskouski)

    Previously, insert-only activity would trigger auto-analyze but not auto-vacuum, on the grounds that there could not be any dead tuples to remove. However, a vacuum scan has other useful side-effects such as setting page-all-visible bits, which improves the efficiency of index-only scans. Also, allowing an insert-only table to receive periodic vacuuming helps to spread out the work of “freezing” old tuples, so that there is not suddenly a large amount of freezing work to do when the entire table reaches the anti-wraparound threshold all at once.

    If necessary, this behavior can be adjusted with the new parameters autovacuum_vacuum_insert_threshold and autovacuum_vacuum_insert_scale_factor, or the equivalent table storage options.
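For illustration, the equivalent table storage options could be set like this on Postgres 13 or later. The table name mytable comes from the question, and the values are placeholders to show the syntax, not tuning advice.

```sql
-- Postgres 13+: tune insert-triggered autovacuum for one table (illustrative values).
ALTER TABLE mytable SET (
    autovacuum_vacuum_insert_threshold    = 10000,
    autovacuum_vacuum_insert_scale_factor = 0.05
);

-- Revert to the server-wide settings.
ALTER TABLE mytable RESET (
    autovacuum_vacuum_insert_threshold,
    autovacuum_vacuum_insert_scale_factor
);
```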
