Rails 4 database connection pool error

Question

I have a Rails app hosted with NGINX and Puma. Every 10 hours or so, the app becomes unusable. Whenever a user tries to connect, the following error message is displayed:

Error during failsafe response: could not obtain a database connection within 5.000 seconds (waited 5.000 seconds)

This continues until the app is restarted.

I have read that this happens because the database connection pool is full, so there must be threads created in the Rails app that do not release their database connection when they finish. To my knowledge, there is only one place in the app code where threads are used: one block uses the Ruby Timeout module, but it does not access the database.

Following this guide, https://devcenter.heroku.com/articles/concurrency-and-database-connections (I am not actually using Heroku), I have set the size of the database connection pool to 5 with the following config file:

#config/initializers/database_connection.rb
Rails.application.config.after_initialize do
  ActiveRecord::Base.connection_pool.disconnect!

  ActiveSupport.on_load(:active_record) do
    config = ActiveRecord::Base.configurations[Rails.env] ||
                Rails.application.config.database_configuration[Rails.env]
    config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
    config['pool']              = ENV['MAX_THREADS'] || 5 
    ActiveRecord::Base.establish_connection(config)
  end

end

The site runs on Rails 4.0.0. I have read that this may in fact be a Rails 4.0.0 problem that was fixed in later versions, but I am unsure of this (see the related question "ConnectionTimeoutError on Heroku with Postgres").

Is there any way to monitor the number of active database connections in the connection pool? This would make debugging much easier.
Is using the Timeout module within Rails app code likely to be the cause of this problem?
Is this likely to be a Rails 4.0.0 problem rather than a problem with my app?

The Rails app is running in the production environment. I can give more information on my Puma and NGINX configuration if needed.

Answer 1

The fact that the failsafe response is trying to allocate a database connection may be a smoking gun. It might help if you could describe what happens in the failsafe response. The failsafe response was presumably triggered when the original request raised an exception. The Rails show_exception routine that calls the failsafe response runs after the ConnectionManagement middleware has called clear_active_connections! for the current request (which failed with an exception), which means that Rails will not automatically release database connections after the failsafe response fails. The failsafe response handler is therefore responsible for cleaning up its own database connections. I'm not sure it's good practice for the failsafe response handler to connect to the database at all, but if that is the desired behavior, you may have to call clear_active_connections! explicitly at the end of your failsafe handler (in an ensure block).
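The ensure-block cleanup described above could look roughly like the following. This is a minimal sketch, not the asker's actual handler: with_connection_cleanup is a hypothetical helper name, and the defined? guard is only there so the snippet also loads outside a Rails process.

```ruby
# Hypothetical helper: run a block (for example the body of a failsafe
# handler) and always hand the current thread's connections back to the pool.
def with_connection_cleanup
  yield
ensure
  # clear_active_connections! releases connections checked out by the
  # current thread. The guard makes this a no-op when ActiveRecord is not loaded.
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.clear_active_connections!
  end
end
```

Because the cleanup sits in ensure, it runs whether the block returns normally or raises, and the block's return value or exception is passed through unchanged.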

I've been investigating a similar problem in my own app and found this to be a useful guide to how connections work: https://bibwild.wordpress.com/2014/07/17/activerecord-concurrency-in-rails4-avoid-leaked-connections/. While the code referenced there may need a few tweaks, it gives a good outline of how to go about detecting when you create an implicit database connection.
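For the question about monitoring the pool, one possible approach is to count how many pooled connections are currently checked out. The sketch below assumes the pool object responds to size and connections, and that each connection responds to in_use?, which is my understanding of the Rails 4 ConnectionPool and adapter interface; check this against your exact Rails version. pool_usage is a hypothetical helper name.

```ruby
# Hypothetical diagnostic helper. Assumes `pool` responds to #size and
# #connections, and each connection to #in_use? (an assumption about the
# Rails 4 pool interface, not verified against 4.0.0 specifically).
def pool_usage(pool)
  conns  = pool.connections
  in_use = conns.count(&:in_use?)
  {
    size:   pool.size,            # configured maximum for this process
    open:   conns.size,           # connections opened so far
    in_use: in_use,               # currently checked out by some thread
    idle:   conns.size - in_use   # open but not checked out
  }
end
```

In a Rails console or a periodic log line, this might be called as pool_usage(ActiveRecord::Base.connection_pool). If in_use stays at the pool size while no requests are running, connections are probably being leaked rather than merely busy.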

Answer 2

I don't think it's a Rails 4.0.0 problem.

As mentioned in the Ruby Timeout module documentation, it spawns a new thread. I think there is a chance that it spawns a long-running thread which keeps a DB connection alive. To check the running threads, you can use the Thread.list method. Also keep in mind that your pool size must be greater than or equal to the number of Puma threads multiplied by the number of Puma workers.
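As a rough illustration of the Thread.list suggestion, the following sketch groups live threads by status so that an unexpected build-up of sleeping threads is easy to spot. thread_summary is a hypothetical helper name; it uses only core Ruby.

```ruby
# Hypothetical helper: count live threads by status ("run", "sleep", ...).
# Thread.list returns only threads that are runnable or stopped.
def thread_summary(threads = Thread.list)
  threads.each_with_object(Hash.new(0)) do |thread, counts|
    counts[thread.status.to_s] += 1
  end
end
```

Logging thread_summary periodically from the running app, or calling it from a console attached to the process, would show whether the thread count grows over the hours before the pool is exhausted.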


solution1
2 2014-11-22 01:04:53

solution2
0 2014-11-02 20:00:26
