In a multi-server environment, if a site is inactive for more than 15 minutes, the server loses its connection to the PostgreSQL database

I get the following errors in Airbrake if my staging (2 servers) or production (4 servers) servers have no activity for about 15 minutes. Here are the error messages:

ActiveRecord::StatementInvalid: PG::Error: could not receive data from server: Connection timed out

OR

PG::Error: could not connect to server: Connection timed out Is the server running on host "tci-db4.dev.prod" and accepting TCP/IP connections on port 5432?

I'm using PostgreSQL as my database. One of the servers also acts as the db server.

Environment:

Ruby 1.9.3 (This also happened under Ruby 1.8.7, but it has been worse since upgrading: when the server loses the db connection, the ruby process on the server goes to 100% and stays at 100% until it is killed.)

Rails 3.1.6

PG gem 0.13.2

Postgres 9.1

Phusion Passenger

This problem has been happening for over a year, so I'm hoping someone has some insight on how to fix it. Thanks.

Check your TCP/IP socket timeout settings on all routers/switches between the application servers and the database servers. Also turn on logging on the database side, watch the full life cycle of the connection, and compare the timing to the errors in your application. I suggest turning on the following settings in postgresql.conf until you get an idea of what to look for:

log_connections = on
log_disconnections = on
log_statement = all

These can be activated with a SIGHUP of the postgres process (or by running "SELECT pg_reload_conf();" as a database superuser).

I'll bet that you have a "connection closed by remote host" or something similar as the last message before the actual disconnect is logged.

I've seen this before and it was the timeout settings on an intermediate switch causing it.

You probably have a NAT router, connection tracking firewall, or an uppity "layer 3 switch" between the client and the server. These devices flush remembered connections from their tables after a timeout. You will need to enable keepalives.
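
As a rough sketch of what enabling keepalives could look like on the server side, PostgreSQL 9.1 has three keepalive settings in postgresql.conf. The values below are illustrative assumptions, not tested recommendations; the only real constraint is that the idle time stays well under the roughly 15-minute timeout described in the question. For comparison, the Linux default idle time before the first keepalive probe is 7200 seconds (2 hours), which is far too long to keep such a device from forgetting the connection.

```
# Illustrative values (assumptions); tune them for your own network.
tcp_keepalives_idle = 300      # seconds of inactivity before the first keepalive probe
tcp_keepalives_interval = 60   # seconds to wait before resending an unanswered probe
tcp_keepalives_count = 5       # unanswered probes before the connection is considered dead
```

A value of 0 for any of these means the operating system default is used, and they only apply to connections made over TCP, not over Unix-domain sockets.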

Maintaining a lot of kept-alive connections from 4 application servers may be quite hard to do (it may represent a very high number of connections). You may look at PgPool-II to maintain a reasonable number of kept-alive connections between pgpool and your postgres server. pgPool will also queue connections when too many processes ask for one. After that, check how the connections are managed in your application. Is there a pool of connections managed in the app server? Do you still need it? Do you need long-standing connections, or can you simply use short-session connections?

If you still have disconnected sessions between PgPool and your PostgreSQL server, you will have to check for TCP/IP problems. Such problems can come from the OS TCP/IP settings, but these can also be tweaked in the PostgreSQL configuration. Check the tcp_keepalive settings on the PostgreSQL runtime configuration manual page. If you use pgpool, check the health_check settings.
