
How to generate statistics for a table and its columns in Snowflake?

Is there a function in Snowflake, like Generate Statistics in Netezza, that generates column metadata (duplicates, unique values, min value, max value, etc.)?

No, not really.

You have the TABLES view, which contains the storage size and the number of rows,
but the rest of the information (including the COLUMNS view) is schema metadata, not data metadata.

On the other hand, the table structure itself (that is, the micro-partitions) contains table metadata that makes functions such as MIN() and MAX() very efficient. Some of the table statistics may be cached globally (i.e., in the Cloud Services part of the Snowflake architecture).
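Since there is no built-in equivalent, one workaround is to compute the column statistics with an ordinary aggregate query. The following is only a minimal sketch: `build_profile_query`, `profile_table` and the `orders` table are hypothetical names, and it runs against an in-memory SQLite database so that it is self-contained. The generated SQL uses only COUNT, COUNT(DISTINCT), MIN and MAX, so it should carry over to Snowflake, but connecting to Snowflake is not shown here.

```python
import sqlite3


def build_profile_query(table, columns):
    """Build one SELECT that returns the row count plus per-column stats.

    Table and column names are interpolated as-is, so they must come
    from a trusted source (hypothetical helper, not a Snowflake API).
    """
    parts = ["COUNT(*) AS row_count"]
    for col in columns:
        parts.append(f"COUNT(DISTINCT {col}) AS {col}_distinct")
        # COUNT(col) skips NULLs, so the difference is the NULL count.
        parts.append(f"COUNT(*) - COUNT({col}) AS {col}_nulls")
        parts.append(f"MIN({col}) AS {col}_min")
        parts.append(f"MAX({col}) AS {col}_max")
    return f"SELECT {', '.join(parts)} FROM {table}"


def profile_table(conn, table, columns):
    """Run the profiling query and return the single result row as a dict."""
    cur = conn.execute(build_profile_query(table, columns))
    names = [d[0] for d in cur.description]
    return dict(zip(names, cur.fetchone()))


def demo():
    """Profile a tiny made-up table held in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10), (2, 10), (3, None), (4, 25)],
    )
    return profile_table(conn, "orders", ["amount"])


if __name__ == "__main__":
    print(demo())
```

The number of duplicated non-NULL values in a column can be derived from the result as `row_count - nulls - distinct`.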

Thank you for the question on stats gathering in Snowflake. Some information:

  1. During data loading (all DML such as COPY and INSERT/UPDATE/DELETE), Snowflake already gathers these stats automatically at the micro-partition level.
  2. During query processing, our optimizer automatically uses these stats to improve query performance.
  3. Automatic background services such as the automatic clustering service (if enabled for a given table) also use those stats to continuously and incrementally fine-tune the clustering quality of the table.

All these auto-magic features happen without manual intervention by the user (which is why Snowflake is known as a self-tuning, simple-to-use data warehousing platform).
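To illustrate why per-partition statistics help, here is a toy model in Python. It is not Snowflake's internal format or algorithm; the `Partition` class and the helper functions are invented for this sketch. It only shows the general idea that keeping a min and max per partition lets a table-level MIN() be answered from metadata alone, and lets an equality filter skip partitions whose range cannot contain the value.

```python
class Partition:
    """Toy stand-in for a micro-partition holding one column's values."""

    def __init__(self, rows):
        self.rows = list(rows)
        # Stats are gathered once, when the partition is written.
        self.min_value = min(self.rows)
        self.max_value = max(self.rows)


def table_min(partitions):
    """Answer MIN() from partition metadata without reading any rows."""
    return min(p.min_value for p in partitions)


def table_max(partitions):
    """Answer MAX() from partition metadata without reading any rows."""
    return max(p.max_value for p in partitions)


def scan_equal(partitions, target):
    """Return (matching values, number of partitions actually scanned)."""
    scanned = 0
    matches = []
    for p in partitions:
        # Skip partitions whose [min, max] range cannot contain the target.
        if p.min_value <= target <= p.max_value:
            scanned += 1
            matches.extend(v for v in p.rows if v == target)
    return matches, scanned


def demo():
    """Build three partitions of made-up values and look up the value 14."""
    partitions = [Partition([1, 5, 9]), Partition([10, 14, 19]), Partition([20, 22, 30])]
    return table_min(partitions), table_max(partitions), scan_equal(partitions, 14)


if __name__ == "__main__":
    print(demo())
```

In this sketch only one of the three partitions is read for the lookup, because the other two are ruled out by their min/max ranges.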

