
How to synchronize a PostgreSQL database with data from a MySQL database?

I have an application at Location A (LA-MySQL) that uses a MySQL database, and another application at Location B (LB-PSQL) that uses a PostgreSQL database. (By location I mean physically distant places on different networks, if that matters.)

I need to update one table at LB-PSQL so that it stays synchronized with LA-MySQL, but I don't know what the best practices in this area are.

Also, the table I need to update at LB-PSQL does not necessarily have the same structure as the one at LA-MySQL. (But I don't think that is a problem, since the fields I need to update on LB-PSQL can accommodate the data from the LA-MySQL fields.)

Given this, what are the best practices, usual methods, or references for doing this kind of thing?

Thanks in advance for any feedback!

If both servers are on different networks, the only option I see is to export the data from MySQL into a flat file.

Then transfer the file to the PostgreSQL server (e.g. via FTP or something similar) and import it there using COPY.
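A minimal sketch of the export and transfer step, assuming a hypothetical source table mytable with columns id, name and price, and that the PostgreSQL host is reachable over SSH (so scp is used instead of FTP); hosts, users and names are placeholders:

# Export the table from MySQL as tab-separated values.
# --batch prints tab-separated rows, --skip-column-names drops the header line.
mysql --batch --skip-column-names -u myuser -p mydb \
  -e "SELECT id, name, price FROM mytable" > /tmp/mytable.tsv

# Transfer the flat file to the PostgreSQL server.
scp /tmp/mytable.tsv pguser@lb-psql-host:/tmp/mytable.tsv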

I would recommend importing the flat file into a staging table. From there you can use SQL to move the data to the appropriate target table. That gives you the chance to do data conversion or update existing rows.
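Continuing the same sketch on the PostgreSQL side: load the file into a staging table with \copy, then move the rows into the real target table with plain SQL. Table and column names are placeholders, and the UPDATE/INSERT pair is just one way of handling rows that already exist:

psql -d mydb <<'SQL'
-- Staging table matching the layout of the flat file.
CREATE TABLE IF NOT EXISTS mytable_staging (id integer, name text, price numeric);

-- Load the transferred file (tab-separated, the default text format).
\copy mytable_staging from '/tmp/mytable.tsv'

-- Update rows that already exist in the target table...
UPDATE target_table t
   SET name = s.name, price = s.price
  FROM mytable_staging s
 WHERE t.id = s.id;

-- ...and insert the rows that don't exist yet.
INSERT INTO target_table (id, name, price)
SELECT s.id, s.name, s.price
  FROM mytable_staging s
 WHERE NOT EXISTS (SELECT 1 FROM target_table t WHERE t.id = s.id);

-- Empty the staging table for the next run.
TRUNCATE mytable_staging;
SQL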

If that transformation is more complicated, you might want to think about using an ETL tool (e.g. Kettle) to do the migration on the target server.

Not a turnkey solution, but here is some code to help with this task using triggers. The following assumes no deletes or updates, for brevity. Needs PostgreSQL >= 9.1.

1) Prepare two new tables, mytable_a and mytable_b, with the same columns as the source table to be replicated:

CREATE TABLE mytable_a AS TABLE mytable WITH NO DATA;
CREATE TABLE mytable_b AS TABLE mytable WITH NO DATA;

-- trigger function which copies data from mytable to mytable_a on each insert
CREATE OR REPLACE FUNCTION data_copy_a() RETURNS trigger AS $data_copy_a$
    BEGIN
        INSERT INTO mytable_a SELECT NEW.*;
        RETURN NEW;
    END;
$data_copy_a$ LANGUAGE plpgsql;

-- create the trigger on the source table
CREATE TRIGGER data_copy_a AFTER INSERT ON mytable FOR EACH ROW EXECUTE PROCEDURE data_copy_a();

Then when you need to export:

-- move data from mytable_a -> mytable_b without stopping trigger
WITH d_rows AS (DELETE FROM mytable_a RETURNING * )  INSERT INTO mytable_b SELECT * FROM d_rows; 

-- export data from mytable_b -> file
\copy mytable_b to '/tmp/data.csv' WITH DELIMITER ',' csv; 

-- empty table
TRUNCATE mytable_b;

Then you can import data.csv into MySQL.
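One way to do that import, assuming data.csv has been transferred to the MySQL host, the target table mytable already exists there, and the server allows client-side loads (local_infile); this is only a sketch:

# Load the CSV produced by \copy into MySQL.
mysql --local-infile=1 -u myuser -p mydb -e "
  LOAD DATA LOCAL INFILE '/tmp/data.csv'
  INTO TABLE mytable
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n';"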

Just create a script on LA that will do something like this (bash sample):

# Dump the PostgreSQL table as INSERT statements and replay them in MySQL.
set -o pipefail  # so a failing pg_dump is not masked by the awk stage
TMPFILE=$(mktemp) || { echo "mktemp failed" 1>&2; exit 1; }
pg_dump --column-inserts --data-only --no-password \
  --host="LB_hostname" --username="username" \
  --table="tablename" "databasename" \
  | awk '/^INSERT/ {i=1} {if(i) print} # ignore everything up to the first INSERT' \
  > "$TMPFILE" \
  || { echo "pg_dump failed" 1>&2; exit 1; }
( echo "begin; truncate tablename;"; cat "$TMPFILE"; echo 'commit;' ) \
  | mysql "databasename" \
  || { echo "mysql failed" 1>&2; exit 1; }
rm "$TMPFILE"

And set it to run, for example, once a day from cron. You'd need a .pgpass file for the PostgreSQL password and a MySQL option file for the MySQL password.
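For example, a crontab entry that runs the script every night at 02:00 (the script path and log file are placeholders):

# minute hour day-of-month month day-of-week command
0 2 * * * /usr/local/bin/sync_tablename.sh >> /var/log/sync_tablename.log 2>&1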

This should be fast enough for less than a million rows.
