PostgreSQL 14 lz4 compression not working?

I have PostgreSQL 14 installed and I want to compress some of my data to save disk space. The data is audio files (1 kB to 5 MB), converted to base64 strings. I created 3 tables:

CREATE TABLE t_uncompressed (
    file_name VARCHAR(50) NOT NULL PRIMARY KEY,
    file_size BIGINT,
    raw_data       TEXT
);


CREATE TABLE t_lz4 (
    file_name VARCHAR(50) NOT NULL PRIMARY KEY,
    file_size BIGINT,
    raw_data       TEXT COMPRESSION lz4
);


CREATE TABLE t_pglz (
    file_name VARCHAR(50) NOT NULL PRIMARY KEY,
    file_size BIGINT,
    raw_data       TEXT COMPRESSION pglz
);

Then I inserted my data into these tables. I checked which rows were actually compressed and found that 403 rows were compressed with lz4 and only one with pglz.

SELECT count(*) total, 
count(*) FILTER ( WHERE pg_column_compression(raw_data) NOTNULL) compressed, 
'lz4' compr_type
    FROM t_lz4
UNION
SELECT count(*) total, 
count(*) FILTER ( WHERE pg_column_compression(raw_data) NOTNULL) compressed, 
'pglz' compr_type
    FROM t_pglz;

 total | compressed | compr_type 
-------+------------+------------
   738 |          1 | pglz
   738 |        403 | lz4
(2 rows)

What seems weird to me is that all three tables have the same size. I can accept that for the uncompressed table and the pglz one, but why does the t_lz4 table have the same size?

I'm getting the table sizes this way:

SELECT schemaname || '.' || tablename full_tname
    , pg_size_pretty(pg_total_relation_size('"' || schemaname || '"."' || tablename || '"')) total_usage
    , pg_size_pretty(pg_relation_size('"' || schemaname || '"."' || tablename || '"')) data_size
    , pg_size_pretty((pg_total_relation_size('"' || schemaname || '"."' || tablename || '"') -
                      pg_relation_size('"' || schemaname || '"."' || tablename || '"') -
                      pg_indexes_size('"' || schemaname || '"."' || tablename || '"')))
    AS TOAST
    FROM pg_catalog.pg_tables
    WHERE tablename ~ 't_';

  full_tname         | total_usage | data_size | toast  
---------------------+-------------+-----------+--------
 t_lz4               | 338 MB      | 80 kB     | 338 MB
 t_pglz              | 338 MB      | 80 kB     | 338 MB
 t_uncompressed      | 338 MB      | 80 kB     | 338 MB
 (3 rows)

The default compression used in the database is pglz; maybe this is significant.

postgres=# SHOW default_toast_compression ;
 default_toast_compression 
---------------------------
 pglz
(1 row)

You didn't say what kind of audio files these are. Most audio file formats are already compressed and won't compress further. Base64-encoding them does mean they should compress a little, because the expansion caused by the encoding can be squeezed back out. However, LZ without Huffman encoding (which is what both pglz and lz4 are) is particularly bad at compressing this type of expansion. That is what you are seeing when you look at the size of the tables: the compression is futile.
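To see how little is being saved, the raw and stored sizes of the column can be compared directly. The query below is only a sketch against the `t_lz4` and `t_pglz` tables from the question; it assumes `octet_length()` reports the uncompressed length of the text and `pg_column_size()` reports the bytes actually used to store the value, which is how these built-in functions are documented.

```sql
-- Compare the logical size of raw_data with the bytes actually stored.
-- stored_pct close to 100 means compression saved almost nothing.
SELECT 't_lz4' AS tbl,
       count(*) AS n_rows,
       pg_size_pretty(sum(octet_length(raw_data)))   AS raw_size,
       pg_size_pretty(sum(pg_column_size(raw_data))) AS stored_size,
       round(100.0 * sum(pg_column_size(raw_data))
             / nullif(sum(octet_length(raw_data)), 0), 1) AS stored_pct
FROM t_lz4
UNION ALL
SELECT 't_pglz',
       count(*),
       pg_size_pretty(sum(octet_length(raw_data))),
       pg_size_pretty(sum(pg_column_size(raw_data))),
       round(100.0 * sum(pg_column_size(raw_data))
             / nullif(sum(octet_length(raw_data)), 0), 1)
FROM t_pglz;
```

If the 403 rows marked as lz4-compressed saved only a few bytes each, `stored_pct` would stay near 100 for both tables, which would be consistent with the identical 338 MB totals.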

In addition, the compression implementations used by PostgreSQL check for futility and give up on compressing data that doesn't seem very compressible. That is what you are seeing with pg_column_compression().
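A small experiment can make this behaviour visible. The following is a sketch, not a benchmark: `t_demo` and its columns are hypothetical names, the server is assumed to be built with lz4 support, and base64 of SHA-512 digests is used only as a stand-in for already-compressed audio.

```sql
-- Scratch table with an lz4-compressed text column (hypothetical name).
CREATE TEMP TABLE t_demo (
    label   TEXT PRIMARY KEY,
    payload TEXT COMPRESSION lz4
);

-- Highly repetitive text: LZ matching has plenty to work with.
INSERT INTO t_demo
VALUES ('repetitive', repeat('abcdefgh', 100000));

-- Base64 of pseudo-random bytes, standing in for compressed audio.
INSERT INTO t_demo
SELECT 'base64_random',
       encode(string_agg(sha512(convert_to(g::text, 'UTF8')), ''::bytea),
              'base64')
FROM generate_series(1, 10000) AS g;

-- method is NULL when PostgreSQL stored the value uncompressed.
SELECT label,
       octet_length(payload)          AS raw_bytes,
       pg_column_size(payload)        AS stored_bytes,
       pg_column_compression(payload) AS method
FROM t_demo;
```

The repetitive row should report `lz4` with a stored size far below its raw size. The base64 row may or may not be marked as compressed, but its stored size is expected to stay close to the raw size, mirroring what the question observed.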
