简体   繁体   中英

Can't use PGPool with Amazon RDS Postgres

I have a Postgres 9.4 RDS instance with Multi-AZ, and there's a slave, read-only replica.

Up to this point the load balancing was made in the business layer of my app, but it's inefficient, and I was hoping to use PGPool, so the app interacts with a single Postgres connection.

It turns out that using PGPool has been a pain in the ass. If I set it to act as a load balancer, simple SELECT queries throw errors like:

 SQLSTATE[HY000]: General error: 7 
 message contents do not agree with length in message type "N" 
 server sent data ("D" message) 
 without prior row description ("T" message)

If I set it to act in a master/slave mode with stream replication (as suggested in Postgres mail list ) I get:

psql: ERROR:  MD5 authentication is unsupported 
in replication and master-slave modes.
HINT:  check pg_hba.conf

Yeah, well, pg_hba.conf if off hands in RDS so I can't alter it.

Has anyone got PGPool to work in RDS? Are there other tools that can act as middleware to take advantage of reading replicas in RDS?

I was able to make it work here are my working config files:

You have to use md5 authentication, and sync the username/password from your database to the pool_passwd file. Also need enable_pool_hba, load_balance_mode, and master_slave_mode on.

pgpool.conf

listen_addresses = '*'
port = 9999
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/tmp'
listen_backlog_multiplier = 1
backend_hostname0 = 'master-rds-database-with-multi-AZ.us-west-2.rds.amazonaws.com'
backend_port0 = 5432
backend_weight0 = 0
backend_flag0 = 'ALWAYS_MASTER'
backend_hostname1 = 'readonly-replica.us-west-2.rds.amazonaws.com'
backend_port1 = 5432
backend_weight1 = 999
backend_flag1 = 'ALWAYS_MASTER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
ssl = on
num_init_children = 1
max_pool = 2
connection_cache = off
replication_mode = off
load_balance_mode = on
master_slave_mode = on

pool_hba.conf

local   all         all                               md5
host    all         all         127.0.0.1/32          md5

pool_passwd

username:md5d51c9a7e9353746a6020f9602d452929

to update pool_password you can use pg_md5 or

echo username:md5`echo -n usernamepassword | md5sum`
username:md5d51c9a7e9353746a6020f9602d452929 -

Output of running example:

psql --dbname=database --host=localhost --username=username --port=9999

database=> SHOW POOL_NODES;

 node_id |                        hostname                 | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+-------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | master-rds-database.us-west-2.rds.amazonaws.com | 8193 | up     | 0.000000  | primary | 0          | false             | 0
 1       | readonly-replica.us-west-2.rds.amazonaws.com    | 8193 | up     | 1.000000  | standby | 0          | true              | 0

database=> select now();

 node_id |                        hostname                 | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+-------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | master-rds-database.us-west-2.rds.amazonaws.com | 8193 | up     | 0.000000  | primary | 0          | false             | 0
 1       | readonly-replica.us-west-2.rds.amazonaws.com    | 8193 | up     | 1.000000  | standby | 1          | true              | 1

database=> CREATE TABLE IF NOT EXISTS tmp_test_read_write ( data varchar(40) );
CREATE TABLE
database=> INSERT INTO tmp_test_read_write (data) VALUES (concat('',inet_server_addr()));
INSERT 0 1
database=> select data as master_ip,inet_server_addr() as replica_ip from tmp_test_read_write;
  master_ip   |  replica_ip
--------------+---------------
 172.31.37.69 | 172.31.20.121
(1 row)

You can also see from the logs id does both databases:

2018-10-16 07:56:37: pid 124528: LOG:  DB node id: 0 backend pid: 21731 statement: CREATE TABLE IF NOT EXISTS tmp_test_read_write ( data varchar(40) );
2018-10-16 07:56:47: pid 124528: LOG:  DB node id: 0 backend pid: 21731 statement: INSERT INTO tmp_test_read_write (data) VALUES (concat('',inet_server_addr()));
2018-10-16 07:56:52: pid 124528: LOG:  DB node id: 1 backend pid: 24890 statement: select data as master_ip,inet_server_addr() as replica_ip from tmp_test_read_write;

Notice the insert used ip_address of master, and the next select used ip_address of the read only replica.

I can update after more testing, but psql client testing looks promising.

There is Citus(pgShard) that is supposed to work with standard Amazon RDS instances. It has catches though. You will have a single point of failure if you use the open source version. It's coordinator node is not duplicated.

You can get a fully HA seamless fail over version of it but you have to buy the enterprise licence, but it is CRAZY expensive. It will easily cost you $50,000 to $100,000 or more per year.

Also they are REALLY pushing their cloud version now which is even more insanely expensive.

https://www.citusdata.com/

I have also heard of people using HAProxy to balance between Postgres or MySql nodes.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM