Is it a good idea to split data into multiple tables instead of storing everything in one table in PostgreSQL?

I have a use case where every data row is accessed by time and some key, and the number of keys may vary between 1,000 and 10,000. I see two options: store everything in one table indexed by key and timestamp, or store the data for each key in a separate table, which means I would end up with thousands of tables. Which approach is better from a storage and access-speed perspective? There will be no need to JOIN between keys, or only very rarely. I'm planning to use PostgreSQL.

As mentioned in the comment, that would be home-spun partitioning, so you might consider declarative list partitioning instead. Since 10,000 partitions is already quite a lot, you could combine several keys into one partition.
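A minimal sketch of what that could look like, assuming an integer key and a timestamp column (the table name measurements and the columns sensor_key, recorded_at and payload are made-up names for illustration, not taken from the question):

    -- Parent table, partitioned by the key column.
    CREATE TABLE measurements (
        sensor_key  bigint      NOT NULL,
        recorded_at timestamptz NOT NULL,
        payload     text
    ) PARTITION BY LIST (sensor_key);

    -- Several keys share one partition to keep the partition count manageable.
    CREATE TABLE measurements_p01 PARTITION OF measurements
        FOR VALUES IN (1, 2, 3, 4, 5);

    CREATE TABLE measurements_p02 PARTITION OF measurements
        FOR VALUES IN (6, 7, 8, 9, 10);

    -- Creating the index on the parent (PostgreSQL 11+) cascades it to every partition.
    CREATE INDEX ON measurements (sensor_key, recorded_at);

Grouping, say, ten keys per partition would leave you with roughly 100 to 1,000 partitions instead of up to 10,000.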

The main advantages of that are:

  • it is easy to delete all data in a single partition (DROP TABLE; see the sketch after this list)

  • if the table is truly big, autovacuum will be much less painful
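
For example, getting rid of all rows for the keys in one partition is a quick metadata operation rather than a slow, bloat-producing bulk DELETE (the partition names continue the hypothetical sketch above):

    -- Removes every row for the keys mapped to this partition almost instantly.
    DROP TABLE measurements_p01;

    -- Or detach it first if the data should be archived rather than discarded.
    ALTER TABLE measurements DETACH PARTITION measurements_p02;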

However, you shouldn't expect the typical query to become faster. Queries can become faster if

  • they have the partitioning key in the WHERE condition and require a sequential scan, as in the query sketch after this list (the speed of index scans is pretty much independent of the table size, except perhaps that better data locality gives you a small gain)

  • they calculate aggregates or joins along partition boundaries
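
To illustrate the first point: when the WHERE clause restricts the partitioning key, the planner can prune partitions, so even a sequential scan only reads the partitions whose key lists can match (again using the hypothetical measurements table from the sketch above):

    -- Only measurements_p01 is scanned, because keys 1 to 3 live there.
    EXPLAIN (COSTS OFF)
    SELECT sensor_key, count(*)
    FROM measurements
    WHERE sensor_key IN (1, 2, 3)
      AND recorded_at >= now() - interval '1 day'
    GROUP BY sensor_key;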

There is also a small gain if partitioning lets you omit an index or an index column.

At any rate, I would only consider partitioning if you expect to have at least hundreds of millions of rows.
