What is the advantage of using a Materialized View over a base table in Amazon Redshift?

Conceptually, I understand that materialized views are static representations of computed values, but I don't understand how that is functionally different from creating a table that contains the same pre-computed data. I would think a table could be even more performant, since one could add sort keys.

I had the same question myself back in the day. As I understand it, the main differences are:

  1. REFRESH MATERIALIZED VIEW syntax. To re-fill a table you would have to truncate the table and run that query again in a transaction. So an MV is more efficient from the coding standpoint.

  2. An MV is a dependent object in the database. Upstream tables (the ones used in its definition) have to be dropped in a cascade fashion, and changes to upstream tables are also quite limited. A table, by contrast, is independent of the query that generated it at some point in time. So it's a design choice. I'd say going with an MV is the more conservative design.
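To make the first difference concrete, here is a minimal sketch of both approaches. The `sales` table, its columns, and the object names `mv_store_totals` and `store_totals` are hypothetical. One caveat worth knowing: in Redshift, TRUNCATE commits the current transaction, so the table variant below uses DELETE to keep the reload atomic.

```sql
-- Hypothetical base table: sales(store_id, sale_date, amount).

-- Materialized view: define the query once...
CREATE MATERIALIZED VIEW mv_store_totals AS
SELECT store_id, SUM(amount) AS total_amount
FROM sales
GROUP BY store_id;

-- ...then bring it up to date with a single statement.
REFRESH MATERIALIZED VIEW mv_store_totals;

-- Plain table: the defining query has to be repeated on every reload.
CREATE TABLE store_totals AS
SELECT store_id, SUM(amount) AS total_amount
FROM sales
GROUP BY store_id;

BEGIN;
-- DELETE rather than TRUNCATE, because TRUNCATE would commit immediately.
DELETE FROM store_totals;
INSERT INTO store_totals
SELECT store_id, SUM(amount) AS total_amount
FROM sales
GROUP BY store_id;
COMMIT;
```

The dependency from point 2 shows up here as well: while `mv_store_totals` exists, a plain `DROP TABLE sales;` is rejected, and `DROP TABLE sales CASCADE;` removes the MV together with the table. `store_totals` is unaffected either way.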

As for keys, you can specify them in the create statement (per the official docs).
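As a rough illustration, distribution and sort keys go between the view name and `AS`. The `sales` table, its columns, and the view name are made up for this sketch.

```sql
-- Hypothetical base table: sales(store_id, sale_date, amount).
CREATE MATERIALIZED VIEW mv_daily_sales
DISTSTYLE KEY
DISTKEY (store_id)      -- co-locate rows for the same store
SORTKEY (sale_date)     -- speeds up range filters on the date
AS
SELECT store_id, sale_date, SUM(amount) AS total_amount
FROM sales
GROUP BY store_id, sale_date;
```

So the sort-key argument from the question does not favor a plain table: the MV can carry the same keys.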

  1. A materialized view, or snapshot as it was previously known, is a table segment whose contents are periodically refreshed based on a query against either a local or a remote table. Using materialized views against remote tables is the simplest way to replicate data between sites.

  2. It can be used as an aggregate table built from multiple tables using joins. We can implement row-level security privileges as well.

  3. Materialized views can be used to improve the performance of a variety of queries, including those performing aggregations and transformations of the data.

  4. Once the tables joined in the MV are loaded, the MV's data gets refreshed automatically, based on the refresh mechanism.
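In Redshift specifically, the automatic refresh mentioned in point 4 is something you opt into with `AUTO REFRESH YES`. A minimal sketch, again with hypothetical table and view names:

```sql
-- Hypothetical base table: sales(store_id, sale_date, amount).
-- Ask Redshift to refresh the view on its own after base-table changes.
CREATE MATERIALIZED VIEW mv_store_counts
AUTO REFRESH YES
AS
SELECT store_id, COUNT(*) AS sale_count
FROM sales
GROUP BY store_id;

-- The setting can also be switched on an existing view.
ALTER MATERIALIZED VIEW mv_store_counts AUTO REFRESH NO;

-- A manual refresh remains available at any time.
REFRESH MATERIALIZED VIEW mv_store_counts;
```

Automatic refresh runs when the cluster has capacity to spare, so it should not be read as a guarantee that the view is always current.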

In short, an MV is created once and then kept current with REFRESH, whereas a table built with CTAS has to be either dropped and recreated or truncated and re-inserted (and an insert after truncate would be slower than drop and recreate). The best part of choosing an MV in Redshift is that it allows two refresh modes, incremental and full. Incremental refresh is heavily restricted: no complex queries, joins, ORDER BY, LIMIT, or aggregations (only basic ones such as COUNT and MAX are allowed). MVs are still preferable, though, because the other mode recomputes the full load with the very same REFRESH command. The sole aspect that favors CTAS over MVs is distribution style: a table supports DISTSTYLE AUTO, while MVs have no such option, so Redshift cannot decide the distribution style by itself when needed.
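A sketch of the CTAS side of that comparison, followed by one way to see which refresh mode an MV actually used. All object names are hypothetical, and the second query assumes the `SVL_MV_REFRESH_STATUS` system view and the columns selected here are available on your cluster.

```sql
-- Hypothetical base table: sales(store_id, sale_date, amount).

-- CTAS route: drop and recreate, letting Redshift pick the distribution.
BEGIN;
DROP TABLE IF EXISTS store_totals_ctas;
CREATE TABLE store_totals_ctas
DISTSTYLE AUTO
AS
SELECT store_id, SUM(amount) AS total_amount
FROM sales
GROUP BY store_id;
COMMIT;

-- MV route: the same command covers incremental and full refreshes;
-- Redshift chooses the mode based on the view definition.
REFRESH MATERIALIZED VIEW mv_store_totals;

-- Assumed system view: the status text of recent refreshes indicates
-- whether the view was updated incrementally or recomputed.
SELECT mv_name, starttime, status
FROM svl_mv_refresh_status
WHERE mv_name = 'mv_store_totals'
ORDER BY starttime DESC
LIMIT 5;
```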