
MariaDB THREADS_CONNECTED build up with php-fpm static

The Problem

  • My app is written in PHP, on top of Laravel.
  • Every hour, I have to restart php-fpm to prevent the MariaDB database from hitting max_connections = 150 and disabling the app since no more connections can be created.

Diagnostic Information

  • PHP-FPM is configured as static, with a max child count of 39.
  • DB connections are not configured as persistent.
  • Raising max_connections above 150 only delays the issue.
  • There are three DB nodes and three app nodes. The app nodes only talk to their partner DB node in the same region.
  • The DB nodes are replicating to each other via Galera.
  • The DB nodes have too many connections independently, not as a cluster.
  • Checking show full processlist shows me that the vast majority of connections are in SLEEP state and doing nothing.
  • Using the remote port from the processlist and ss on the app node as well as the php-fpm status page, I've determined that the children holding the connections open are themselves in idle state.

Attempted solutions

  • I've switched php-fpm to dynamic and set the idle timeout to 10s. The children do not quit, and I can't see any errors.
  • I've turned down the number of requests a php-child can handle before it is reaped from 100 to 1 with no effect.
  • I've registered a shutdown handler with PHP that checks if my DB connection is open and produces an alert. No alerts have been sent.
  • I've set up a cronjob to systemctl restart php7.4-fpm every hour. This alleviates the issue, but obviously isn't a good solution.

Questions

  • Under what circumstance does php-fpm maintain a DB connection beyond the end of a script or request?
  • How do I stop it from doing that?

Thanks for reading and any idea that might help.

"I've registered a shutdown handler with PHP that checks if my DB connection is open and produces an alert. No alerts have been sent."

Regardless of whether you're getting the alert, are you certain that this is running explicitly?

DB::disconnect('yourdatabase');

If you think you've got that covered, looking at the PDO object that Laravel is using internally may help. Instead of just debugging via an alert, which may be getting lost somewhere, logging a positive assertion that the connection is closed could be helpful.

$pdo = DB::connection()->getPdo();
// My "positive assertion" comment is because this can be deceiving:
alertThatDoesNotWork();

// Instead, with odd problems like this, I prefer to do
logStateOfDatabaseConnection();

Without more information, my best guess is related to "not guilty" !== "innocent". The log-regardless-of-state approach reduces the outputs to "innocent" and "guilty" with no ambiguity.
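To make the "log regardless of state" idea concrete, here is a minimal sketch in plain PHP. The helper names describeConnectionState() and registerConnectionStateLogger() are hypothetical and not part of Laravel or PDO, and the idea that the callable would wrap something like DB::connection()->getPdo() is an assumption about how you would wire it in.

```php
<?php

/**
 * Build one log line describing the connection state, whatever it is.
 * Hypothetical helper: a missing line then means "the check never ran",
 * not "everything was fine".
 */
function describeConnectionState(?PDO $pdo): string
{
    // From PHP's side, a null handle is the only state we can call "closed".
    $state = $pdo === null ? 'closed' : 'open';

    return sprintf('pid=%d db_connection=%s', getmypid(), $state);
}

/**
 * Register a shutdown hook that always writes exactly one line.
 * $getPdo is any callable returning the current PDO handle or null
 * (assumption: in a Laravel app it would wrap the framework's own accessor).
 */
function registerConnectionStateLogger(callable $getPdo): void
{
    register_shutdown_function(static function () use ($getPdo): void {
        error_log(describeConnectionState($getPdo()));
    });
}
```

With this in place, every request should leave one line per php-fpm child in the error log. A child that holds a sleeping connection but never logged anything tells you the shutdown handler is not running there at all.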

2022-02-09 Edit: Have you checked for a persistence setting? I think this may conventionally be in a database.php or somesuch:

  'options' => [
    \PDO::ATTR_PERSISTENT => true
  ]

If you can find a use of "PDO::ATTR_PERSISTENT" in setting up the database, then that may be the culprit. If you're able to test after switching it off, it may at least clearly define the problem at the cost of some latency.
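A quick way to double-check this is to inspect the connection arrays directly rather than reading the config by eye. The sketch below assumes a Laravel-style layout where each connection has an optional 'options' key; the function name usesPersistentConnections() and the sample connection names are made up for illustration.

```php
<?php

/**
 * Return true when a Laravel-style connection array enables
 * persistent PDO connections. Hypothetical helper.
 */
function usesPersistentConnections(array $connectionConfig): bool
{
    $options = $connectionConfig['options'] ?? [];

    // Missing key, false, 0 and null all count as "not persistent".
    return !empty($options[PDO::ATTR_PERSISTENT]);
}

// Hypothetical excerpt of the 'connections' section of a database config.
$connections = [
    'mysql'     => ['driver' => 'mysql', 'options' => [PDO::ATTR_PERSISTENT => true]],
    'reporting' => ['driver' => 'mysql', 'options' => []],
];

foreach ($connections as $name => $config) {
    printf("%s: persistent=%s\n", $name, usesPersistentConnections($config) ? 'yes' : 'no');
}
```

In a real app you would feed it the resolved configuration instead of a literal array, since environment overrides can change what the file on disk appears to say.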

It's still a bug, but potentially one you can't do much about if the PDO object's destructor is never running. From the manual:

The connection remains active for the lifetime of that PDO object. To close the connection, you need to destroy the object by ensuring that all remaining references to it are deleted--you do this by assigning NULL to the variable that holds the object. If you don't do this explicitly, PHP will automatically close the connection when your script ends.

Perhaps the script is ending without destroying your PDO object, and the harder stop of the process ending (which would free the DB resource) isn't happening because the fpm process is persistent. A hail-Mary attempt would be throwing a gc_collect_cycles(); into your shutdown function in the hope that it works around a resource leak.

If there are still references to the PDO object that Laravel hasn't cleaned up, then to some extent it's their bug. If you can hunt down references to the object, including those in your debug code, and destroy them, perhaps you can drive the ref-count to 0 so that it gets cleaned up properly. I'm not saying that we should all go back to malloc() and free(), but sometimes these "convenience features" aren't so convenient. :-/
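The ref-count point can be illustrated without a database. In the sketch below, FakeConnection is a stand-in for a PDO handle and Holder is a stand-in for whatever object keeps a reference to it; both classes are hypothetical, and whether your app actually has a cycle like this is an assumption you would need to verify.

```php
<?php

/** Stand-in for a PDO handle: the destructor marks where the connection would close. */
final class FakeConnection
{
    public static int $closed = 0;
    public ?object $owner = null;

    public function __destruct()
    {
        self::$closed++;
    }
}

/** Stand-in for a service that holds the connection and is referenced back by it. */
final class Holder
{
    public FakeConnection $conn;

    public function __construct(FakeConnection $conn)
    {
        $this->conn = $conn;
        $conn->owner = $this; // creates a reference cycle
    }
}

// Case 1: a single reference. Assigning null drops the ref-count to zero,
// so the destructor runs immediately.
$conn = new FakeConnection();
$conn = null;
$afterNull = FakeConnection::$closed;

// Case 2: a reference cycle. Dropping the variable is not enough, because
// the two objects still point at each other.
$holder = new Holder(new FakeConnection());
$holder = null;
$beforeGc = FakeConnection::$closed;

// The cycle collector finds the orphaned pair and runs the destructor.
gc_collect_cycles();
$afterGc = FakeConnection::$closed;

printf("after null: %d, before gc: %d, after gc: %d\n", $afterNull, $beforeGc, $afterGc);
```

If the second case matches what is happening in your app, calling gc_collect_cycles() in the shutdown function would be a workaround, while removing the back-reference would be the actual fix.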
