
Redshift data storage schema

AWS Redshift is described as a columnar MPP database.

I would expect a table (relation) to be split by column, with each column stored in its own blocks, for example:

blk0    col0_val0, col0_val1, col0_val2, ..., col0_val15
blk1    col0_val16,........................., col0_val31
...
blkn    col1_val22,..........................,col1_val50

which means each block stores values from only one column.
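To make the expected layout concrete, here is a minimal sketch in Python. It is a toy model only, not Redshift's actual on-disk format; the function name `columnar_blocks` and the block size of 16 values are assumptions chosen to match the illustration above.

```python
from typing import Dict, List, Tuple


def columnar_blocks(table: Dict[str, List[str]], block_size: int) -> List[Tuple[str, List[str]]]:
    """Split each column into fixed-size blocks; every block holds one column only."""
    blocks = []
    for col_name, values in table.items():
        # Walk the column in steps of block_size and cut one block per step.
        for start in range(0, len(values), block_size):
            blocks.append((col_name, values[start:start + block_size]))
    return blocks


# Hypothetical two-column table with 32 rows.
table = {
    "col0": [f"col0_val{i}" for i in range(32)],
    "col1": [f"col1_val{i}" for i in range(32)],
}

blocks = columnar_blocks(table, block_size=16)
for i, (col, vals) in enumerate(blocks):
    print(f"blk{i}  {col}: {vals[0]} ... {vals[-1]}")
```

In this model, reading a single column touches only that column's blocks, which is the property I expect from columnar storage.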

However, after some research (reference: http://www.slideshare.net/AmazonWebServices/building-your-data-warehouse-with-amazon-redshift/24, slide 24), it appears that Redshift stores data in the following layout:

blk0    col0_val0, col1_val0, col0_val1, col1_val1......
...
blkn    col0_val100, col1_val100 ......

that is, with multiple columns (a whole row) in each block.
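For comparison, this is what I mean by a row-wise layout, again as a toy Python sketch rather than anything Redshift-specific. The function name `row_blocks` and the choice of two rows per block are assumptions for illustration.

```python
from typing import List, Sequence


def row_blocks(rows: Sequence[Sequence[str]], rows_per_block: int) -> List[List[str]]:
    """Pack whole rows into blocks, so values from different columns are interleaved."""
    blocks = []
    for start in range(0, len(rows), rows_per_block):
        block: List[str] = []
        for row in rows[start:start + rows_per_block]:
            block.extend(row)  # all columns of one row are stored next to each other
        blocks.append(block)
    return blocks


# Hypothetical table with two columns and four rows.
rows = [(f"col0_val{i}", f"col1_val{i}") for i in range(4)]

row_layout = row_blocks(rows, rows_per_block=2)
for i, block in enumerate(row_layout):
    print(f"blk{i}  " + ", ".join(block))
```

Here every block mixes values from col0 and col1, which is how I understand a row-oriented store.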

Isn't that row-oriented? Why is it called columnar storage?

Another reference: the blue zone map diagram on the page https://blog.chartio.com/blog/understanding-interleaved-sort-keys-in-amazon-redshift-part-1

I see the reason for the confusion. Yes, you are correct that columnar databases (Redshift included) store blocks of table data "columnarly", meaning any given block should contain data from only a single column. And yes, that is true for Redshift.

The links you reference discuss compound and interleaved sort keys, which are essentially an optional indexing method that Redshift can use to make certain types of random (i.e. non-sequential) access much faster. Only in those cases, and only when the sort key you choose contains multiple columns, do multiple column values get combined into a single block. From a performance optimization perspective, this makes sense: if I want all my data sorted by a combination of "month_name" and "day_number" (an oversimplified example, admittedly), the combined sort key would want to store both of those values sequentially within the same blocks.
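As a rough illustration of what "sorted by a combination of columns" means, here is a small Python sketch. It only models the ordering that a compound key implies and says nothing about how Redshift physically lays out blocks; the function `sort_by_compound_key` and the sample rows are made up for the example.

```python
from typing import Dict, List, Sequence, Union

Row = Dict[str, Union[str, int]]


def sort_by_compound_key(rows: List[Row], key_columns: Sequence[str]) -> List[Row]:
    """Order rows by the first key column, breaking ties with the following ones."""
    return sorted(rows, key=lambda r: tuple(r[c] for c in key_columns))


# Hypothetical sample data; month names sort alphabetically here, not by calendar order.
rows = [
    {"month_name": "Feb", "day_number": 3},
    {"month_name": "Jan", "day_number": 20},
    {"month_name": "Feb", "day_number": 1},
    {"month_name": "Jan", "day_number": 5},
]

sorted_rows = sort_by_compound_key(rows, ["month_name", "day_number"])
for row in sorted_rows:
    print(row["month_name"], row["day_number"])
```

After sorting, rows with the same "month_name" sit next to each other and are ordered by "day_number" within each month, so a filter on both columns only has to look at one contiguous range.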

Hope this helps to clarify!!
