简体   繁体   中英

manually create replication slot for publication in PostgreSQL 10

I am trying to get a stream of updates for certain tables from my PostgreSQL database. The regular way of getting all updates looks like this:

You create a logical replication slot

pg_create_logical_replication_slot('my_slot', 'wal2json');

And either connect to it using pg_recvlogical or making special SQL queries. This allows you to get all the actions from the database in json (if you used wal2json plugin or similar) and then do whatever you want with that data.

But in PostgreSQL 10 we have Publication/Subscription mechanism which allows us to replicate selected tables only. This is very handy because a lot of useless data is not being sent. The process looks like this:

First, you create a publication

CREATE PUBLICATION foo FOR TABLE herp, derp;

Then you subscribe to that publication from another database

CREATE SUBSCRIPTION mysub CONNECTION <connection stuff> PUBLICATION foo;

This creates a replication slot on a master database under the hood and starts listening to updates and commit them to the same tables on a second database. This is fine if your job was to replicate some tables, but want to get a raw stream for my stuff.

As I mentioned, the CREATE SUBSCRIPTION query is creating a replication slot on the master database under the hood, but how can I create one manually without the subscription and a second database? Here the docs say:

To make this work, create the replication slot separately (using the function pg_create_logical_replication_slot with the plugin name pgoutput)

According to the docs, this is possible, but pg_create_logical_replication_slot only creates a regular replication slot. Is the pgoutput plugin responsible for all the magic? If yes, then it becomes impossible to use other plugins like wal2json with publications.

What am I missing here?

After you have created the logical replication slot and the publication, you can create a subscription this way:

CREATE SUBSCRIPTION mysub
       CONNECTION <conn stuff>
       PUBLICATION foo
       WITH (slot_name=my_slot, create_slot=false);

Not sure if this answers your question.

I have limited experience with logical replication and logical decoding in Postgres, so please correct me if below is wrong. That being said, here is what I have found:

  1. Publication support is provided by pgoutput plugin. You use it via plugin-specific options . It may be that other plugins have possibility to add the support, but I do not know whether the logical decoding plugin interface exposes sufficient details. I tested it against wal2json plugin version 9e962ba and it doesn't recognize this option.
  2. Replication slots are created independently from publications. Publications to be used as a filter are specified when fetching changes stream. It is possible to peek changes for one publication, then peek changes for another publication and observe different set of changes despite using the same replication slot (I did not find it documented and I was testing on Aurora with Postgres compatibility, so behavior could potentially vary).
  3. Plugin output seems to include all entries for begin and commit, even if transaction did not touch any of tables included in publication of interest. It does not however include changes to other tables than included in the publication.

Here is an example how to use it in Postgres 10+:

-- Create publication
CREATE PUBLICATION cdc;

-- Create slot
SELECT pg_create_logical_replication_slot('test_slot_v1', 'pgoutput');

-- Create example table
CREATE TABLE replication_test_v1
(
  id integer NOT NULL PRIMARY KEY,
  name text
);

-- Add table to publication
ALTER PUBLICATION cdc ADD TABLE replication_test_v1;

-- Insert example data
INSERT INTO replication_test_v1(id, name) VALUES
  (1, 'Number 1')
;

-- Peak changes (does not consume changes)
SELECT pg_logical_slot_peek_binary_changes('test_slot_v1', NULL, NULL, 'publication_names', 'cdc', 'proto_version', '1');

-- Get changes (consumes changes)
SELECT pg_logical_slot_get_binary_changes('test_slot_v1', NULL, NULL, 'publication_names', 'cdc', 'proto_version', '1');

To stream changes out of Postgres to other systems, you can consider using Debezium project. It is an open source distributed platform for change data capture, which among others provides a PostgreSQL connector. In version 0.10 they added support for pgoutput plugin. Even if your use case is very different from what the project offers, you can look at their code to see how they interact with replication API.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM