
Bots are killing MySQL with “too many connections”. It is NOT a max_connections issue

My website can handle 40,000 or more simultaneous visitors and run fast, but a search engine bot will kill MySQL. It's been driving me insane: once the bots come, the site shows "Could not connect: too many connections" and I have to manually restart mysqld to get the website back up. I have been tackling this for a year and have made countless adjustments to Apache and MySQL tuning, but nothing works. I have raised max_connections from 300 to 1800 to 10000, and that does not fix the bot problem.

I use Amazon Linux on a huge instance, so RAM is not an issue. I have opened countless tech support cases and they never find anything wrong, so I have to assume it has something to do with my programming. I do not use WordPress; I built my site from scratch, but as I said, it can handle 40,000 people no problem. Bots, though, crash it.

My connect script is simple:

$connect=mysql_connect("localhost","user","password"); 
if (!$connect)
  {
  die('Could not connect: ' . mysql_error());
  }
mysql_select_db("db",$connect);

The odd thing is, MySQL always shows "1" current connection even when there are 2,000 people on the site. That is why I feel like I'm doing something wrong when connecting to the database.

Does anyone have experience or advice on keeping a site running at all times with heavy bot traffic? PLEASE!!! I repeat, this is not an increase max_connections issue.

MySQL is accepting new connections but can't keep up with all the queries. Waiting connections pile up until there are too many.
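You can watch this pile-up happen live. Assuming you can open a mysql client on the server while the bots are hitting it, these standard diagnostic statements show it:

```sql
-- Threads_connected climbing toward max_connections confirms the pile-up.
SHOW STATUS LIKE 'Threads_connected';
SHOW VARIABLES LIKE 'max_connections';

-- Lists every open connection and what it is doing; long-running queries
-- in the "State"/"Time" columns are the ones backing everything up.
SHOW FULL PROCESSLIST;
```

If `Threads_connected` only spikes when crawlers arrive, that confirms the bots, not your normal traffic, are what exhausts the connection limit.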

The problem isn't really MySQL; it's the bots that are misbehaving. You probably don't need all those bots scanning your whole site every time, and luckily you have some control over them.

Step 1: Create a robots.txt and disallow all bots except the ones you care about. Note that you must use the crawlers' official tokens: Googlebot for Google, bingbot for Bing/MSN, Slurp for Yahoo.

User-agent: Googlebot
Disallow:

User-agent: Slurp
Disallow:

User-agent: bingbot
Disallow:

User-agent: *
Disallow: /
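Some crawlers (Bing and Yandex, but not Google) also honor a Crawl-delay directive that throttles their request rate. If the sheer volume of bot requests is what overwhelms MySQL, it is worth adding a group like this (the 10-second value is illustrative):

```
User-agent: bingbot
Crawl-delay: 10
Disallow:
```

For Googlebot, the crawl rate is controlled through Google Search Console instead of robots.txt.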

Step 2: Create a sitemap. Setting the last-modified time of each page means the bots will only re-crawl the pages that have changed. You can generate the sitemap dynamically (querying your DB) using the PHP library thepixeldeveloper/sitemap.

In the example, we assume you have a database with a pages table that has permalink and last_modified columns.

// sitemap.php

$urlSet = new Thepixeldeveloper\Sitemap\Urlset();

// Add the URL for '/' to the XML map
$homeUrl = (new Thepixeldeveloper\Sitemap\Url('/'))
  ->setChangeFreq('daily')
  ->setPriority(1.0);

$urlSet->addUrl($homeUrl);

// Add the URL of each page to the sitemap
$result = mysql_query("SELECT permalink, last_modified FROM pages");

while ($page = mysql_fetch_assoc($result)) {
    $url = (new Thepixeldeveloper\Sitemap\Url($page['permalink']))
      ->setLastMod($page['last_modified'])
      ->setChangeFreq('monthly')
      ->setPriority(0.5);

    $urlSet->addUrl($url);
}

header('Content-Type: application/xml'); // sitemaps are XML, not plain text
echo (new Thepixeldeveloper\Sitemap\Output())->getOutput($urlSet);

You can use a rewrite rule in Apache (or the equivalent in another HTTP server) to rewrite sitemap.xml to sitemap.php.

RewriteEngine On
RewriteRule ^sitemap\.xml$ sitemap.php [L]

This should be sufficient, though there may be bots that do not respect robots.txt. Detect them and block them (by IP and/or User-Agent) in your HTTP server configuration.
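Assuming Apache 2.4, blocking a misbehaving crawler by its User-Agent string can be sketched like this. The bot names in the pattern are placeholders: substitute whatever agents actually show up in your access logs.

```
# mod_setenvif: tag requests whose User-Agent matches a known bad crawler
BrowserMatchNoCase "MJ12bot|AhrefsBot|SemrushBot" bad_bot

# mod_authz_core: refuse tagged requests with 403 Forbidden
<Location "/">
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
</Location>
```

Blocking by IP works the same way with `Require not ip 203.0.113.0/24`, but user-agent matching is usually easier to maintain since crawlers rotate IPs.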


Also consider the following:

Max connections exist so your server doesn't overload. Run a benchmark to determine the maximum number of parallel requests your application can handle, reduce that number by 20%, and set the result as the maximum in both your HTTP server and MySQL configuration.

This way your server returns a clean 503 Service Unavailable before it overloads. Well-behaved bots will give up and try again later, so the system recovers without manual attention.
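As a sketch: if the benchmark says the box copes with about 250 parallel requests, the corresponding limits for Apache's prefork MPM and my.cnf would look like this (the numbers are illustrative, derive yours from your own benchmark):

```
# Apache (prefork MPM): cap parallel requests at ~80% of the benchmarked max
MaxRequestWorkers  200
```

```
# /etc/my.cnf: one DB connection per Apache worker, plus a little headroom
[mysqld]
max_connections = 210
```

The key point is that the HTTP server's worker limit, not MySQL's max_connections, becomes the throttle, so excess requests queue or get a 503 instead of piling up as database connections.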

Your script should also exit with the correct HTTP response.

$connect = mysql_connect("localhost", "user", "password"); 
if (!$connect) {
  header("HTTP/1.1 503 Service Unavailable");
  header("Retry-After: 120"); // hint to well-behaved clients when to come back
  echo 'Could not connect: ' . mysql_error();
  exit();
}
mysql_select_db("db", $connect);
