Redshift - Using sort key is doubling the table size?

I am trying to optimize some tables, and I noticed something strange: a table that never had a sort key defined has a size of 720 (blocks, in MB). When I define a sort key or a compound sort key, the table size doubles to 1440. I have tried this in conjunction with different distribution styles, but that does not change the table size. I also ran ANALYZE COMPRESSION, but it found nothing to improve.

So what should I do in this case? I need the tables to have a sort key since they are used in a lot of joins, but doubling the table size is not ideal. I tried googling this particular issue, but there seems to be nothing on it.

Edit: I am not adding the sort key to an existing table. Instead, I created a new table like the current one and then inserted all the data. Just to be sure, I also ran statistics and vacuum on the new table.

Your table size is being determined by the minimum table size. See this support document for more information: Why does a table in my Amazon Redshift cluster consume more disk storage space than expected?

For tables created using the KEY or EVEN distribution style: Minimum table size = block_size (1 MB) * (number_of_user_columns + 3 system columns) * number_of_populated_slices * number_of_table_segments.

The minimum table size is increased when you have a sort key because Redshift allocates extra space for new unsorted data to be added to the table.

If an Amazon Redshift table has a sort key, the table has two segments—one sorted segment and one unsorted segment. If an Amazon Redshift table has no sort key, all data is unsorted, and therefore the table has one unsorted segment.
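To see why the size exactly doubles, the formula can be evaluated with one segment and then with two. The sketch below is only an illustration: the function name is made up, and the table shape (27 user columns on 24 populated slices) is an assumption chosen because it reproduces the 720 MB from the question. The real column and slice counts are not given in the post.

```python
BLOCK_SIZE_MB = 1    # block size from the formula above
SYSTEM_COLUMNS = 3   # system columns counted by the formula


def minimum_table_size_mb(user_columns: int, populated_slices: int, has_sort_key: bool) -> int:
    """Minimum size in MB for a table with KEY or EVEN distribution."""
    # A sort key adds a second segment (sorted + unsorted); otherwise there is one.
    segments = 2 if has_sort_key else 1
    return BLOCK_SIZE_MB * (user_columns + SYSTEM_COLUMNS) * populated_slices * segments


# Hypothetical shape: 27 user columns, 24 populated slices (assumed, not from the post).
without_sort_key = minimum_table_size_mb(27, 24, has_sort_key=False)
with_sort_key = minimum_table_size_mb(27, 24, has_sort_key=True)
print(without_sort_key, with_sort_key)  # 720 1440
```

Whatever the actual column and slice counts are, the only factor that changes when a sort key is added is the number of segments, so a table sitting at its minimum size goes from N to 2N.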

When data is added to an existing table with a sort key, the new data is maintained in a separate segment that contains unsorted data—the data is not inserted into the original sorted key segment until a VACUUM operation is performed.
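The behavior described above can be pictured with a toy model. This is not how Redshift is implemented; the class and method names are hypothetical and plain Python lists stand in for the two segments. It only shows the idea that new rows sit in an unsorted region until a vacuum merges them into the sorted one.

```python
class ToySortKeyTable:
    """Toy model of a table with a sorted and an unsorted segment."""

    def __init__(self):
        self.sorted_segment = []    # rows already in sort-key order
        self.unsorted_segment = []  # rows added since the last vacuum

    def insert(self, rows):
        # New rows are appended to the unsorted segment, not merged in place.
        self.unsorted_segment.extend(rows)

    def vacuum(self):
        # Merge the unsorted rows into the sorted segment and clear the rest.
        self.sorted_segment = sorted(self.sorted_segment + self.unsorted_segment)
        self.unsorted_segment = []


table = ToySortKeyTable()
table.insert([3, 1, 2])
table.vacuum()
table.insert([5, 4])
print(table.sorted_segment, table.unsorted_segment)  # [1, 2, 3] [5, 4]
table.vacuum()
print(table.sorted_segment, table.unsorted_segment)  # [1, 2, 3, 4, 5] []
```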
