Ubuntu 12.04 High CPU usage on Amazon EC2 Small instance

I am running a small instance on Amazon EC2 with Ubuntu 12.04 LTS. I have also set up a CloudWatch alarm on the instance.

The problem is that CPU utilization sometimes goes above 90%, and I get an alarm notification for it. I have set up a cron job on the instance that runs every minute and stores the top 3 processes by CPU usage in a log file. The cron job is as follows.

* * * * * ps -eo pcpu,pid,args --no-headers | sort -n -r | head -3 | perl -pe 'print scalar(localtime()), " ";' >> ps_log/log

But I can't see any process with high CPU usage when I run this command on the cron log.

cat ps_log/log | sort -k 6 -n -r | head -10

The following is the latest output from the cron log.

Tue May 13 17:44:01 2014 17.1 10171 /opt/lampp/bin/httpd -k start -DSSL -DPHP5 -E /opt/lampp/logs/error_log
Tue May 13 18:06:01 2014 15.1 10502 /opt/lampp/bin/httpd -k start -DSSL -DPHP5 -E /opt/lampp/logs/error_log
Tue May 13 13:28:01 2014 14.7  6526 /opt/lampp/bin/httpd -k start -DSSL -DPHP5 -E /opt/lampp/logs/error_log
Tue May 13 09:56:01 2014 12.4  3277 /opt/lampp/bin/httpd -k start -DSSL -DPHP5 -E /opt/lampp/logs/error_log
Tue May 13 18:06:01 2014 11.4 10508 /opt/lampp/bin/httpd -k start -DSSL -DPHP5 -E /opt/lampp/logs/error_log
Wed May 14 02:32:36 2014 11.0 16936 ps -eo pcpu,pid,args --no-headers
Tue May 13 13:32:01 2014 10.3  6619 /opt/lampp/bin/httpd -k start -DSSL -DPHP5 -E /opt/lampp/logs/error_log
Tue May 13 18:06:01 2014 10.2 10501 /opt/lampp/bin/httpd -k start -DSSL -DPHP5 -E /opt/lampp/logs/error_log
Tue May 13 11:08:01 2014  9.6  4802 /opt/lampp/bin/httpd -k start -DSSL -DPHP5 -E /opt/lampp/logs/error_log
Wed May 14 02:58:07 2014  8.5 17268 ps -eo pcpu,pid,args --no-headers

I can show the alarm results alongside the cron log entries from around the time the alarm notification arrived.

  • Reason for State Change: Threshold Crossed: 1 datapoint (96.72) was greater than or equal to the threshold (80.0).
  • Timestamp: Tuesday 13 May, 2014 15:42:09 UTC

Cronlog:

Tue May 13 15:39:20 2014  2.0  8481 perl -pe print scalar(localtime()), " ";
Tue May 13 15:39:20 2014  1.6  8478 ps -eo pcpu,pid,args --no-headers
Tue May 13 15:39:20 2014  1.2  8004 /opt/lampp/bin/httpd -k start -DSSL -DPHP5 -E /opt/lampp/logs/error_log
Tue May 13 15:41:08 2014  1.7  8527 /opt/lampp/sbin/mysqld --basedir=/opt/lampp --datadir=/opt/lampp/var/mysql --plugin-dir=/opt/lampp/lib/mysql/plugin --user=nobody --log-error=/opt/lampp/var/mysql/ip-10-178-52-49.err --pid-file=/opt/lampp/var/mysql/ip-10-178-52-49.pid --socket=/opt/lampp/var/mysql/mysql.sock --port=3306
Tue May 13 15:41:08 2014  1.5  8547 ps -eo pcpu,pid,args --no-headers
Tue May 13 15:41:08 2014  0.9  8003 [httpd] <defunct>
Tue May 13 15:43:01 2014  6.0  8578 sort -n -r
Tue May 13 15:43:15 2014  5.0  8577 ps -eo pcpu,pid,args --no-headers
Tue May 13 15:43:24 2014  3.3  8579 head -3
Tue May 13 15:44:21 2014  1.2  8527 /opt/lampp/sbin/mysqld --basedir=/opt/lampp --datadir=/opt/lampp/var/mysql --plugin-dir=/opt/lampp/lib/mysql/plugin --user=nobody --log-error=/opt/lampp/var/mysql/ip-10-178-52-49.err --pid-file=/opt/lampp/var/mysql/ip-10-178-52-49.pid --socket=/opt/lampp/var/mysql/mysql.sock --port=3306
Tue May 13 15:44:21 2014  0.7  8569 CRON
Tue May 13 15:44:21 2014  0.7  8501 /opt/lampp/bin/httpd -k start -DSSL -DPHP5 -E /opt/lampp/logs/error_log

Is there any way I can catch the process with high CPU usage? The instance hosts a website that has very low traffic. Any help would be appreciated.

A couple of things:

  1. You may see a disconnect between the CPU usage that Linux reports inside a virtual machine and what Amazon reports as the real CPU usage. Note that the latter is correct. CPU usage stats from ps and top are unreliable in this setting; a good explanation is here:

    http://www.axibase.com/cloud/2010/07/22/ec2-monitoring-the-case-of-stolen-cpu/

  2. Regardless of the accuracy of the top and ps commands, something is causing the CPU to spike, and ps and top should at least tell you which processes are using the most. Instead of calling ps once a minute, which may miss the offending process, why not run it in a loop from a bash script with a short interval (say, every 10 seconds)? Redirect the output to a log file and you should be able to find a ps or top entry within a few seconds of the alarm. Something like this:

     while :
     do
       date
       echo
       ps -eo pcpu,pid,args --no-headers
       echo
       top -c -b -n 1
       echo
       sleep 10
     done
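To check the first point directly, the loop could also log how much CPU time the hypervisor is taking away from the instance. The following is only a sketch under some assumptions: it reads the aggregate "cpu" line of /proc/stat, where the eighth counter after the label is steal time on reasonably recent Linux kernels, and the helper names steal_pct and sample_steal are made up for illustration rather than taken from any standard tool.

```shell
#!/bin/sh
# Sketch: estimate the share of CPU time "stolen" by the hypervisor.
# steal_pct and sample_steal are hypothetical helpers, not standard commands.

# Print the steal percentage between two aggregate "cpu" lines from /proc/stat.
# Fields after the label: user nice system idle iowait irq softirq steal ...
steal_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total[NR] = 0
            # Sum user..steal (fields 2-9); guest time is already part of user.
            for (i = 2; i <= 9; i++) total[NR] += $i
            steal[NR] = $9
        }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (steal[2] - steal[1]) / dt
        }'
}

# Read the counters twice, a given number of seconds apart (default 10),
# and print a timestamped steal percentage for that interval.
sample_steal() {
    interval=${1:-10}
    before=$(grep '^cpu ' /proc/stat)
    sleep "$interval"
    after=$(grep '^cpu ' /proc/stat)
    echo "$(date) steal=$(steal_pct "$before" "$after")%"
}
```

Calling `sample_steal 10` in place of the plain `sleep 10` would add one steal line per iteration to the log. If that figure is high around the time of an alarm while no process stands out in ps, the gap between the two views is more likely steal time than a runaway process. The `st` value in the CPU summary line of top reports the same counter.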
