
Will it be a good idea to split database data into multiple tables instead of storing everything in one table for PostgreSQL?

I have a use case where every data row is accessed by time and some key, and the number of keys may vary between 1000 and 10000. I have two options: store everything in one table and index it by key and timestamp, or store the data for each key in a separate table, in which case I will end up with thousands of tables. What will be the best approach from a storage and access-speed perspective? There will be no need to JOIN between keys, or only very rarely. I'm planning to use PostgreSQL.

As mentioned in the comment, that would be home-spun partitioning, so you might consider declarative list partitioning. Since 10000 partitions are already quite a lot, you could combine several keys into one partition.
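A minimal sketch of what that could look like, using hypothetical table and column names (`measurements`, `key_id`, `recorded_at`), with several keys grouped into each list partition:

```sql
-- Parent table, list-partitioned on the key (names are illustrative).
CREATE TABLE measurements (
    key_id      integer     NOT NULL,
    recorded_at timestamptz NOT NULL,
    payload     jsonb
) PARTITION BY LIST (key_id);

-- Each partition holds a group of keys, keeping the partition count
-- well below the number of keys.
CREATE TABLE measurements_p1 PARTITION OF measurements
    FOR VALUES IN (1, 2, 3, 4, 5);

CREATE TABLE measurements_p2 PARTITION OF measurements
    FOR VALUES IN (6, 7, 8, 9, 10);

-- Index for access by key and time; created on the parent, it is
-- cascaded to every partition.
CREATE INDEX ON measurements (key_id, recorded_at);

-- Deleting all data for a group of keys is then just:
-- DROP TABLE measurements_p1;
```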

The main advantages of that are:

  • it is easy to delete all data in a single partition (DROP TABLE)

  • if the table is truly big, autovacuum will be much less painful

However, you shouldn't expect the typical query to become faster. Queries can become faster if

  • they have the partitioning key in the WHERE condition and require a sequential scan (the speed of index scans is pretty much independent of the table size, except perhaps that better data locality gives you a small gain)

  • they calculate aggregates or joins along partition boundaries

It is also a small gain if you can omit an index or an index column because of partitioning.
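To illustrate the first point, assuming a hypothetical table `measurements` list-partitioned on `key_id`: when the partitioning key appears in the WHERE clause, the planner can prune all partitions whose value lists don't match, so a sequential scan only touches the relevant partition.

```sql
-- With key_id in the WHERE clause, partition pruning limits the scan
-- to the single partition containing that key.
EXPLAIN (COSTS OFF)
SELECT *
FROM   measurements
WHERE  key_id = 3
  AND  recorded_at >= now() - interval '1 day';
```

Without `key_id` in the condition, every partition would have to be scanned, and partitioning buys nothing for that query.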

At any rate, I would only consider partitioning if you expect to have at least hundreds of millions of rows.
