psql: FATAL: Could not obtain a transaction ID from GTM. The GTM might have failed or lost connectivity

Question

I want to create a Postgres-XL cluster. The cluster includes 5 nodes: 1 GTM, 2 Coordinators and 2 Datanodes. The details of the nodes are as follows:

GTM:   
hostname=localhost  
nodename=gtm  
IP=127.0.0.1  
port=20001

Coordinator1:  
hostname=localhost  
nodename=coord1  
IP=127.0.0.1  
pooler_port=30011, port=30001  

Coordinator2:  
hostname=host2  
nodename=coord2  
IP=10.4.6.36  
pooler_port=30012, port=30002  

Datanode1:  
hostname=localhost  
nodename=dn1  
IP=127.0.0.1  
pooler_port=40011, port=40001  

Datanode2:  
hostname=host2  
nodename=dn2  
IP=10.4.6.36  
pooler_port=40012, port=40002

I have installed pgxc_ctl and added /usr/local/pgsql/bin to PATH for postgres. I have configured SSH authentication so that pgxc_ctl does not ask for a password. I have edited postgresql.conf and pg_hba.conf on both nodes.

Then I built the cluster as follows:

$ pgxc_ctl
PGXC$  add gtm master gtm localhost 20001 $dataDirRoot/gtm    
PGXC$  add coordinator master coord1 localhost 30001 30011 $dataDirRoot/coord_master.1 none none
PGXC$  add coordinator master coord2 10.4.6.36 30002 30012 $dataDirRoot/coord_master.2 none none

After adding coord2, I got the following:

psql: FATAL: Could not obtain a transaction ID from GTM. The GTM might have failed or lost connectivity

PGXC$  add datanode master dn1 localhost 40001 40011 $dataDirRoot/dn_master.1 none none none
PGXC$  add datanode master dn2 10.4.6.36 40002 40012 $dataDirRoot/dn_master.2 none none none

After adding dn2, I got the following error:

ERROR: Failed to get pooled connections
HINT: This may happen because one or more nodes are currently unreachable, either because of node or network failure. It's also possible that the target node may have hit the connection limit or the pooler is configured with low connections. Please check if all nodes are running fine and also review max_connections and max_pool_size configuration parameters

But when I monitor all the nodes, it shows:

PGXC$  monitor all
Running: gtm master
Running: coordinator master coord1
Running: coordinator master coord2
Running: datanode master dn1
Running: datanode master dn2

I could not connect to coord2 by running:

 psql -h 10.4.6.36 -p 30002 -U user -d postgres

It shows:

psql: FATAL: Could not obtain a transaction ID from GTM. The GTM might have failed or lost connectivity

But I could connect to coord1 by running:

psql  -p 30001 -U user -d postgres
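
Because `monitor all` reports every node as running while one coordinator still refuses sessions, it can help to test plain TCP reachability of each port from the machine in question. The following is a minimal sketch in Python using only the standard library. The `can_connect` and `report` helpers are hypothetical names, and the host/port pairs are copied from the node list above. It only checks that a TCP connection opens, not that GTM or a coordinator is healthy.

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

# Host/port pairs taken from the question; run the script on each machine in turn.
NODES = {
    "gtm": ("127.0.0.1", 20001),
    "coord1": ("127.0.0.1", 30001),
    "coord2": ("10.4.6.36", 30002),
    "dn1": ("127.0.0.1", 40001),
    "dn2": ("10.4.6.36", 40002),
}

def report(nodes):
    """Map each node name to whether its port accepted a connection."""
    return {name: can_connect(host, port) for name, (host, port) in nodes.items()}

if __name__ == "__main__":
    for name, ok in report(NODES).items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

Note that `127.0.0.1` always refers to the machine running the script, so running it on host2 shows whether anything is listening on port 20001 there.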

I could ping host2 from my localhost without the password. I need to resolve the above errors. Any help? Adding the configuration:

pgxcInstallDir=$HOME/pgxc
pgxcOwner=$USER     
pgxcUser=$pgxcOwner     
tmpDir=/tmp                 
localTmpDir=$tmpDir         
configBackup=n                  
configBackupHost=pgxc-linker    
configBackupDir=$HOME/pgxc      
configBackupFile=pgxc_ctl.bak   
dataDirRoot=$HOME/DATA/pgxl/nodes

#---- Coordinators ----------------------------------------------------------------------------------------------------

coordMasterDir=$dataDirRoot/coord_master
coordSlaveDir=$HOME/coord_slave
coordArchLogDir=$HOME/coord_archlog
coordExtraConfig=coordExtraConfig   
cat > $coordExtraConfig <<EOF
#================================================
# Added to all the coordinator postgresql.conf
# Original: $coordExtraConfig
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
listen_addresses = '*'
max_pool_size=300
max_connections=200
hot_standby = off
EOF

#---- Datanodes -------------------------------------------------------------------------------------------------------

datanodeMasterDir=$dataDirRoot/dn_master
datanodeSlaveDir=$dataDirRoot/dn_slave
datanodeArchLogDir=$dataDirRoot/datanode_archlog
datanodeExtraConfig=datanodeExtraConfig 
cat > $datanodeExtraConfig <<EOF
#================================================
# Added to all the datanode postgresql.conf
# Original: $datanodeExtraConfig
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
listen_addresses = '*'
max_pool_size=300
max_connections=200
hot_standby = off
EOF
#---- GTM ------------------------------------------------------------------------------------      
gtmName=gtm
gtmMasterServer=localhost
gtmMasterPort=20001
gtmMasterDir=$dataDirRoot/gtm


coordNames=( coord1 coord2  )
coordMasterServers=( localhost 10.4.6.36  )
coordPorts=( 30001 30002  )
poolerPorts=( 30011 30012  )
coordMasterDirs=( $dataDirRoot/coord_master.1 $dataDirRoot/coord_master.2  )
coordMaxWALSenders=( 5 5  )
coordSlave=n
coordSlaveServers=( none none  )
coordSlavePorts=( none none  )
coordSlavePoolerPorts=( none none  )
coordSlaveDirs=( none none  )
coordArchLogDirs=( none none  )
coordSpecificExtraConfig=( coordExtraConfig coordExtraConfig  )
coordSpecificExtraPgHba=( none none  )


datanodeNames=( dn1 dn2  )
datanodeMasterServers=( localhost 10.4.6.36  )
datanodePorts=( 40001 40002  )
datanodePoolerPorts=( 40011 40012  )
datanodeMasterDirs=( $dataDirRoot/dn_master.1 $dataDirRoot/dn_master.2  )
datanodeMasterWALDirs=( none none  )
datanodeMaxWALSenders=( 5 5  )
datanodeSpecificExtraConfig=( datanodeExtraConfig datanodeExtraConfig  )
datanodeSpecificExtraPgHba=( none none  )

Answer 1

Could you show us your configuration?

What are your max_connections and max_pool_size? What did initdb show for your kernel? My guess is that when you add the second Datanode (dn2) you don't have enough connections.

You have:

"The cluster includes 5 nodes: 1 GTM, 2 Coordinators and 2 Datanodes."

Postgres-XL specific:

max_pool_size=300
max_coordinators=2
max_datanodes=2

In case of a Coordinator (minimal settings):

max_connections=100  # number of connections accepted from application(s)
max_prepared_transactions=100  # same as number of connections

In case of a Datanode (minimal settings):

max_connections=200  # 2 coordinators
max_prepared_transactions=2  # specify at least the total number of Coordinators in the cluster

Excerpt from the Postgres(-XL) documentation:

max_connections (integer)

Determines the maximum number of concurrent connections to the database server. The default is typically 100 connections, but might be less if your kernel settings will not support it (as determined during initdb). This parameter can only be set at server start.

When running a standby server, you must set this parameter to the same or higher value than on the master server. Otherwise, queries will not be allowed in the standby server.

In the case of the Coordinator, this parameter determines how many connections can each Coordinator accept.

In the case of the Datanode, number of connection to each Datanode may become as large as max_connections multiplied by the number of Coordinators.

max_pool_size (integer)

Specify the maximum connection pool of the Coordinator to Datanodes. Because each transaction can be involved by all the Datanodes, this parameter should at least be max_connections multiplied by number of Datanodes.
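
The two sizing rules in this excerpt reduce to simple arithmetic. Here is a rough sketch in Python, where `check_sizing` is a hypothetical helper (not part of Postgres-XL) and the sample inputs are the values from the question's configuration. The second rule describes a worst case rather than a hard requirement, so its warning is only a hint.

```python
def check_sizing(coord_max_connections, max_pool_size,
                 datanode_max_connections, n_coordinators, n_datanodes):
    """Apply the two rules from the documentation excerpt; return a list of warnings."""
    warnings = []

    # Rule 1: max_pool_size should be at least max_connections * number of Datanodes.
    needed_pool = coord_max_connections * n_datanodes
    if max_pool_size < needed_pool:
        warnings.append(
            f"max_pool_size={max_pool_size} is below the suggested minimum {needed_pool}"
        )

    # Rule 2: connections to each Datanode may reach max_connections * number of Coordinators.
    worst_case = coord_max_connections * n_coordinators
    if datanode_max_connections < worst_case:
        warnings.append(
            f"datanode max_connections={datanode_max_connections} "
            f"is below the worst case {worst_case}"
        )
    return warnings

if __name__ == "__main__":
    # Question's values: max_connections=200 and max_pool_size=300 on every node,
    # with 2 Coordinators and 2 Datanodes.
    for warning in check_sizing(200, 300, 200, n_coordinators=2, n_datanodes=2):
        print(warning)
```

With the question's values both checks produce a warning (400 is needed in each case), whereas a Coordinator `max_connections` of 100 with the same pool size and Datanode limit passes both.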

Edit (after the question was updated with its configuration)

Try this:

Coordinator
```
max_connections=100
max_pool_size=300
```
Datanode (you have 2 Datanodes defined)
```
max_connections=200
max_pool_size=500
```

Answer 1, 2018-02-22 08:53:40