How to get Rails to automatically reestablish database connections after a database downtime

After a database downtime, Rails will first throw this error once:

ActiveRecord::StatementInvalid: NativeException: org.postgresql.util.PSQLException: Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.

From then on, every database call fails with the following error, even after the database is back up:

ActiveRecord::StatementInvalid: ActiveRecord::JDBCError: This connection has been closed.

To get the server running again, I have to restart the Rails server. This is not ideal for us, as our prod engineers would like to do maintenance on our databases without also having to bring back up all the services that depend on the database. So, I'm wondering: is there a way to get Rails to automatically try to reestablish the database connection, or a recommended way to get this behavior?

Things I have tried:

I have already tried setting reconnect to true in my database options. With that, I can kill individual database connections and Rails will reestablish them. However, it will not do so after a database outage. I found that from a command console I could get the connection back up by calling

ActiveRecord::Base::establish_connection

So maybe finding a clean place for Rails to call the above command would work? Any suggestions?

This is a really ugly solution, but I think it should work.

  • Set reconnect to true, just like you previously did.

  • Edit the file activerecord-XYZ/lib/active_record/connection_adapters/postgresql_adapter.rb and change the reconnect! method to say

     def reconnect!
       clear_cache!
       ActiveRecord::Base.establish_connection
     end

More research is needed

  • Check if it actually works
  • Check that it doesn't call establish_connection several times simultaneously (in which case you'd need a lock)
  • Check if there's a better place to put this code. Ruby lets you redefine any method at runtime, but you need the symbols loaded. In other words, you need the PostgreSQLAdapter class to exist. The closest I've come to having that symbol loaded is in config/environment.rb after initialize!, but it still wasn't deep enough in the stack to have that symbol loaded.
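
The lock mentioned in the list above could look roughly like the following. This is a minimal sketch, not part of Rails: ReconnectGuard and its min_interval parameter are hypothetical names, and the reconnect work is passed in as a block so the example runs on its own.

```ruby
# Hypothetical helper (not a Rails API): lets only one thread at a time run
# the reconnect block, and skips the call when another thread has already
# reconnected within the last `min_interval` seconds.
class ReconnectGuard
  def initialize(min_interval: 1.0)
    @mutex = Mutex.new
    @min_interval = min_interval
    @last_reconnect = nil
  end

  # Returns true if the block ran, false if the call was skipped.
  def reconnect
    @mutex.synchronize do
      now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      return false if @last_reconnect && (now - @last_reconnect) < @min_interval

      yield
      @last_reconnect = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      true
    end
  end
end

# In a Rails app the block would presumably be the call discussed above:
#   GUARD = ReconnectGuard.new
#   GUARD.reconnect { ActiveRecord::Base.establish_connection }
```

Threads that arrive while a reconnect is in progress wait on the mutex and then skip their own reconnect, so a burst of failing requests triggers a single establish_connection call.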

If you do find a place outside the ActiveRecord code that already has the symbol loaded, and you can edit its methods, then put the following code in it:

# reconnect! and clear_cache! are defined on the adapter itself, so reopen that class
class ActiveRecord::ConnectionAdapters::PostgreSQLAdapter
  def reconnect!
    clear_cache!
    ActiveRecord::Base.establish_connection
  end
end

What's more, it's a bit overkill to actually call establish_connection. It might be possible to call only the important parts of that method, to avoid some overhead.

Let me know if this helped, and if you've made any progress.

I had the same issue with Mysql2Adapter. I replicated the failure by running a very long query: User.find_all_by_id((1..1000000).to_a). From then on, all ActiveRecord requests fail (even User.first fails).

Here's how I solved it:

The issue is very simple: whenever we get the exception above, we want to reestablish the connection and try again. We solve it by aliasing the execute method, wrapping the call in begin/rescue, and reestablishing the database connection in the rescue clause.

For MySQL, the code is in Mysql2Adapter, and below is the fix:

Place this code in config/initializers/active_record.rb

module ActiveRecord
  module ConnectionAdapters
    class Mysql2Adapter < AbstractMysqlAdapter
      alias_method :old_execute, :execute

      def execute(sql, name=nil)
        begin
          old_execute(sql, name)
        rescue ActiveRecord::StatementInvalid
          # you can do some logging here

          ActiveRecord::Base.establish_connection

          old_execute(sql, name)
        end
      end
    end
  end
end
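
One caveat with the patch above: ActiveRecord::StatementInvalid is also raised for ordinary SQL errors, so rescuing it unconditionally would reconnect on every bad query. The following is a rough sketch of a narrower variant that retries only when the message looks like a connection failure, using Module#prepend instead of alias_method. FakeAdapter and FakeStatementInvalid are stand-ins so the example runs without Rails, and the message patterns are assumptions taken from the errors quoted in the question.

```ruby
# Stand-ins so the sketch runs without Rails; in an app these would be
# ActiveRecord::StatementInvalid and the real adapter class.
class FakeStatementInvalid < StandardError; end

class FakeAdapter
  attr_reader :reconnects

  def initialize
    @closed = true   # simulate a connection killed by a database outage
    @reconnects = 0
  end

  def execute(sql, name = nil)
    raise FakeStatementInvalid, "This connection has been closed." if @closed
    raise FakeStatementInvalid, "syntax error" if sql.include?("SELEC ")
    "ok: #{sql}"
  end

  def reconnect!
    @reconnects += 1
    @closed = false
  end
end

# Hypothetical wrapper: retries once, and only for connection-level failures.
module ReconnectOnClosedConnection
  CONNECTION_ERRORS = /connection has been closed|connection refused/i

  def execute(sql, name = nil)
    super
  rescue FakeStatementInvalid => e
    raise unless e.message.match?(CONNECTION_ERRORS)
    reconnect!  # in Rails this would be ActiveRecord::Base.establish_connection
    super       # a second failure propagates to the caller
  end
end

FakeAdapter.prepend(ReconnectOnClosedConnection)
```

With prepend, the original execute stays reachable through super, so no old_execute alias is needed.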

You need to do the same for the Postgres adapter.

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb

I think since select is the most used query, you can alias select and as soon as it fails, the connection will be reestablished.

You want to create an initializer (config/initializers/active_record.rb) and alias select_rows in the module below. It might be a different method, such as async_exec or execute; I haven't looked much into the Postgres adapter, so just find the right method and patch it:

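# Nesting as written in this answer; it may differ between Rails versions
class ActiveRecord::ConnectionAdapters::PostgreSQLAdapter
  module DatabaseStatements
    # alias the method you picked (e.g. select_rows) and wrap it here
  end
end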

Insert a rescue somewhere

# The JDBCError from the question surfaces as ActiveRecord::StatementInvalid
rescue ActiveRecord::StatementInvalid
  ActiveRecord::Base.establish_connection
  retry

But where? I do not know.
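
As for placement: in Ruby a rescue clause belongs to a begin block (or a method body), and retry re-runs that block from the top, so the rescue has to surround the call that talks to the database. A bare retry loops forever if the database stays down, so the sketch below caps the number of attempts. The helper with_reconnect and its parameters are hypothetical, and FakeStatementInvalid stands in for ActiveRecord::StatementInvalid so the example runs without Rails.

```ruby
# Stand-in for ActiveRecord::StatementInvalid so the sketch runs on its own.
class FakeStatementInvalid < StandardError; end

# Hypothetical helper: runs the block; on failure it calls `reconnect` and
# retries, giving up after `max_attempts` attempts in total.
def with_reconnect(reconnect:, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue FakeStatementInvalid
    raise if attempts >= max_attempts  # re-raise the last error
    reconnect.call  # in Rails: ActiveRecord::Base.establish_connection
    retry           # re-runs the begin block from the top
  end
end

# Usage sketch inside a Rails app (assumed, not tested against Rails):
#   with_reconnect(reconnect: -> { ActiveRecord::Base.establish_connection }) do
#     User.first
#   end
```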

You can also use rescue_from in ApplicationController. This will not retry the action that failed, though, so you should probably also render an error template:

rescue_from ActiveRecord::StatementInvalid do
  ActiveRecord::Base.establish_connection
  render 'errors/error', :status => 500
end

I'm not sure how to do exactly what you're asking, but I have another 'process' suggestion: set up simple scripts so your prod engineers can easily stop and start all applications.

Develop a set of Capistrano recipes (or other scripts) that your prod engineers can use to stop and start all applications. For a normal Rails app, all you should really need to do is put up a maintenance page, so that nginx or Apache serves that page instead of forwarding requests to the Rails instances. Ideally, the Rails workers then stop getting requests, the database goes down and comes back up, the maintenance page is taken down, and the workers get requests again without ever noticing that the database went away for a while.

Background workers may need to be actually stopped and started by the script, unless their queue is empty and stays empty. Any scheduled rake tasks or other scheduled jobs will probably fail if they depend on the database and run while it's down, so you'll want to schedule them to run outside the window in which you normally do database maintenance.

If your prod engineers don't like running scripts (!), you could probably set up a nice web interface to make it easy for them. This will probably prove useful for more than just dealing with database connection errors, as it will empower more people in your organization to take care of basic things like stopping and starting your apps.

I'd recommend using external tools or scripts to monitor this kind of event. ActiveRecord's reconnect works when the database kills a connection after a certain idle time, and it gives up after a certain number of failures. So it's not going to help in your case.

I think you should write your own script to monitor the status of your database. If it comes back after some time, simply restart your Rails app.
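
A monitoring script along these lines might look roughly like the sketch below. It only checks that the database port accepts TCP connections, which is weaker than running a real query, and DatabaseWatcher, port_open?, and the host, port, and restart command in the usage comment are all hypothetical names chosen for illustration.

```ruby
require "socket"

# Returns true if a TCP connection to host:port succeeds within `timeout` seconds.
def port_open?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Hypothetical watcher: remembers the last observed state and calls
# `on_recovery` only on a down -> up transition.
class DatabaseWatcher
  def initialize(host:, port:, on_recovery:)
    @host = host
    @port = port
    @on_recovery = on_recovery
    @was_up = true  # assume the database is up when monitoring starts
  end

  # Probes once; meant to be called periodically (cron, or a loop with sleep).
  def check
    up = port_open?(@host, @port)
    @on_recovery.call if up && !@was_up
    @was_up = up
  end
end

# Usage sketch (host and restart command are assumptions; use whatever
# restarts your app server):
#   watcher = DatabaseWatcher.new(host: "db.internal", port: 5432,
#                                 on_recovery: -> { system("touch tmp/restart.txt") })
#   loop { watcher.check; sleep 5 }
```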

Besides, you'll need that kind of monitoring anyway, for things like your server's memory, CPU and disk usage, server load, and database status. This is just one more slightly customised monitoring rule.
