简体   繁体   中英

What is the best way to backfill old database data to an existing Postgres database?

A new docker image was recently stood up to replace an existing postgres database. A dump was taken of the database before the old instance was shut down using the following command:

pg_dump -h localhost -p 5432 -d *dbname* -U postgres > *dbname*.pgdump

We'd like to concatenate or append this data to the new database in order to "backfill" some older historical data. The database name and schema of the two databases is identical. What is the easiest, safest way to do this? Secondly, need postgres be shut down during the process?

If overlapping primary keys or unique columns have been assigned to the new data, then there will be no clean way to merge them without putting in some work to clean that up. Assuming that hasn't happened...

The current dump file will have create statements for all the objects that already exists. If you replay that file into the current database, you will get a bunch of errors for all those objects. If you don't have it all run in one transaction, then you could simply ignore those errors. But, you might also load data in the wrong order and get foreign key violations. Those errors will be mixed in with all the other ones about existing object, so might be easy to overlook.

So what I would do is stand up an empty database server, and replay your current dump into that. Then retake the pg_dump, but with either -a or --section=data . Then you should be able to load that dump into your new database. This has two advantages, it will not dump out CREATE statements which are not needed and throw errors which would need to be ignored, and it should dump the tables in an order which will not cause foreign key violations.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM