I run many php-cli scripts via crontab on an Ubuntu server running inside VMware vSphere. The PHP scripts are memory hogs (I'm fixing that in parallel), but the VM should have the resources they need. The load average is very high, in the 100+ range, on a high-performance box with 8 cores and 120 GB of RAM. I'm puzzled why the load is so high given the following:
Environment info:
#uname -a
Linux tasks 3.0.0-2-amd64 #1 SMP Fri Oct 7 20:48:45 UTC 2011 x86_64 GNU/Linux
The following items have been adjusted in sysctl:
#head /etc/sysctl.conf
fs.file-max = 2097152
vm.swappiness = 10
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2
top
top - 10:51:27 up 219 days, 21:50, 3 users, load average: 190.18, 171.37, 152.70
Tasks: 400 total, 179 running, 220 sleeping, 0 stopped, 1 zombie
%Cpu(s): 11.4 us, 1.7 sy, 0.0 ni, 86.2 id, 0.4 wa, 0.0 hi, 0.3 si, 0.0 st
Mb Mem: 121121 total, 51993 used, 69128 free, 17 buffers
Mb Swap: 6257 total, 0 used, 6257 free, 532 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10551 root 20 0 557m 282m 11m R 1.7 0.2 0:05.83 php
6204 root 20 0 555m 286m 10m R 1.0 0.2 0:07.41 php
16516 root 20 0 408m 140m 9744 R 1.0 0.1 0:03.34 php
24167 root 20 0 309m 41m 9784 R 1.0 0.0 0:00.63 php
45041 root 20 0 1894m 1.6g 10m R 1.0 1.3 7:27.72 php
599 root 20 0 521m 254m 10m R 0.7 0.2 0:09.26 php
1101 root 20 0 357m 89m 9796 R 0.7 0.1 3:46.28 php
3273 root 20 0 3342m 3.0g 9756 R 0.7 2.5 3:50.18 php
3958 root 20 0 536m 268m 10m R 0.7 0.2 0:08.28 php
4798 root 20 0 780m 508m 9756 R 0.7 0.4 0:08.26 php
5464 root 20 0 532m 256m 10m R 0.7 0.2 0:08.03 php
5905 root 20 0 536m 268m 10m R 0.7 0.2 0:07.42 php
6913 root 20 0 557m 288m 10m R 0.7 0.2 0:06.89 php
7028 root 20 0 2147m 1.8g 9792 R 0.7 1.6 0:32.89 php
8535 root 20 0 431m 156m 10m R 0.7 0.1 0:06.77 php
8745 root 20 0 2836m 2.5g 10m R 0.7 2.1 4:46.24 php
9224 root 20 0 538m 269m 10m R 0.7 0.2 0:06.36 php
10665 root 20 0 745m 473m 9752 R 0.7 0.4 0:05.96 php
12313 root 20 0 760m 490m 9752 R 0.7 0.4 0:05.15 php
12340 root 20 0 944m 675m 9752 R 0.7 0.6 0:05.15 php
vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
191 1 0 70536200 18216 546040 0 0 0 32 3007 2381 13 2 85 0
187 0 0 70567328 18216 546068 0 0 0 4 2840 2468 12 1 86 0
184 0 0 70650144 18216 546096 0 0 0 0 3802 2655 10 2 88 0
186 0 0 70642768 18216 546120 0 0 0 0 4456 2431 13 1 86 0
186 0 0 70630560 18216 546144 0 0 0 0 4936 2629 15 2 83 0
185 1 0 70620504 18224 546152 0 0 0 32 4584 2459 12 2 86 0
183 0 0 70611000 18224 546192 0 0 0 4 3820 2827 9 2 89 0
190 1 0 70643592 18224 546260 0 0 0 0 4093 3350 12 3 84 1
191 0 0 71065760 18224 546304 0 0 0 0 3745 2503 12 3 84 0
191 4 0 71041560 18224 546332 0 0 0 0 3314 2798 13 2 85 0
187 0 0 71028392 18224 546332 0 0 0 0 3280 3140 12 2 86 0
195 0 0 71015808 18236 546360 0 0 4 240 3164 2945 14 2 84 0
196 0 0 71002112 18236 546388 0 0 0 0 3136 3004 9 2 89 0
194 0 0 70999600 18236 546416 0 0 0 0 3576 3348 14 2 83 0
187 1 0 70994792 18236 546436 0 0 0 0 3362 3193 13 2 85 0
188 0 0 70979392 18236 546448 0 0 0 0 2870 3054 10 2 88 0
What other tools or settings should I be reviewing?
UPDATE: Running htop, I can see that a single core is handling all of the PHP processes. Is there perhaps a setting on the VM or OS that would control this?
Your load average is exactly as expected.
You have 100+ processes running, all at the same time. Therefore your load average should be 100+.
It's a very rough indicator of 'how much stuff is going on right now on the machine', and the answer is: a lot! top and vmstat both show roughly 180 to 190 runnable processes at any moment.
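To make the "load average is roughly a count of runnable processes" point concrete, here is a minimal sketch that parses the /proc/loadavg format and divides the 1-minute figure by the core count. The sample string is hypothetical: it is assembled from the top output in the question, and the last field (most recent PID) is made up.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    one, five, fifteen = (float(x) for x in fields[:3])
    # Fourth field is "currently runnable / total scheduling entities".
    runnable, total = (int(x) for x in fields[3].split("/"))
    return {"1min": one, "5min": five, "15min": fifteen,
            "runnable": runnable, "total": total}


def load_per_core(load, cores):
    """Load divided by core count; values well above 1.0 mean a run queue."""
    return load / cores


def read_live():
    """Read the real values on a Linux box (not called automatically)."""
    with open("/proc/loadavg") as f:
        info = parse_loadavg(f.read())
    return info, load_per_core(info["1min"], os.cpu_count() or 1)


# Hypothetical sample built from the top output in the question.
sample = "190.18 171.37 152.70 179/400 12345"
info = parse_loadavg(sample)
per_core = load_per_core(info["1min"], 8)
print(info["runnable"], round(per_core, 2))
```

With 8 cores, a per-core figure this far above 1 just means that many more processes want a CPU than there are CPUs to run them on.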
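Processes that are sleeping (waiting on a network socket or a timer, for example) are considered 'not running' and don't add to the load average. On Linux, processes blocked in uninterruptible disk I/O do count, but the 'b' column in your vmstat output and the 'wa' percentage in top show almost none of those.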
It looks to me like everything is working as expected! Except for that thing where they're all running on the same CPU.
But if they were all running on different CPUs, your load average would be the same. Only your CPU usage (in aggregate) would be higher.
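One thing that could be checked for the single-core behaviour, though nothing in the output above confirms it as the cause, is CPU affinity: a process can be restricted to a subset of CPUs, and child processes inherit that restriction. A minimal, Linux-only sketch using Python's `os.sched_getaffinity` (the helper names are made up for illustration):

```python
import os


def allowed_cpus(pid=0):
    """CPUs the scheduler may place this process on (pid 0 = this process)."""
    return sorted(os.sched_getaffinity(pid))


def is_pinned(pid=0):
    """True if the process may use fewer CPUs than the machine reports."""
    allowed = os.sched_getaffinity(pid)
    # os.cpu_count() can return None; fall back to the allowed set size.
    total = os.cpu_count() or len(allowed)
    return len(allowed) < total


cpus = allowed_cpus()
print("allowed CPUs:", cpus, "pinned:", is_pinned())
```

Passing the PID of one of the php processes (or of cron itself) instead of 0 shows what that process is allowed to use; `taskset -p <pid>` reports the same mask from the shell.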
Now - if the various processes that you're running are taking too long - that's a different thing. But, again, your load average would still be high.
As for another troubleshooting tool: in top you can press '1' and it should show you a per-CPU breakdown of usage.
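If you'd rather script the per-CPU view than watch top, the same numbers come from /proc/stat. The sketch below is one way to do it under stated assumptions: it takes two snapshots of /proc/stat text and computes the busy percentage per CPU between them, treating idle and iowait as "not busy". The two snapshots here are hypothetical, made up to show one saturated core and one idle core.

```python
def parse_cpu_lines(stat_text):
    """Return {cpu_name: (busy_ticks, total_ticks)} for each per-CPU line."""
    result = {}
    for line in stat_text.splitlines():
        parts = line.split()
        # Keep "cpu0", "cpu1", ...; skip the aggregate "cpu" line and others.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        # Fields: user nice system idle iowait irq softirq steal
        ticks = [int(x) for x in parts[1:9]]
        total = sum(ticks)
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        result[parts[0]] = (total - idle, total)
    return result


def busy_percent(before_text, after_text):
    """Per-CPU busy percentage between two /proc/stat snapshots."""
    before = parse_cpu_lines(before_text)
    after = parse_cpu_lines(after_text)
    out = {}
    for name, (busy, total) in after.items():
        if name not in before:
            continue
        d_busy = busy - before[name][0]
        d_total = total - before[name][1]
        out[name] = 100.0 * d_busy / d_total if d_total else 0.0
    return out


# Hypothetical snapshots: cpu0 fully busy, cpu1 fully idle in between.
BEFORE = "cpu  110 0 55 1835 0 0 0 0\ncpu0 100 0 50 850 0 0 0 0\ncpu1 10 0 5 985 0 0 0 0\n"
AFTER = "cpu  200 0 65 1935 0 0 0 0\ncpu0 190 0 60 850 0 0 0 0\ncpu1 10 0 5 1085 0 0 0 0\n"
usage = busy_percent(BEFORE, AFTER)
print(usage)
```

On a live system you would read /proc/stat, sleep a few seconds, read it again, and pass both strings in. A result shaped like the sample (one core near 100%, the rest near 0%) would match what htop is showing you.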
And "iostat" is a good tool for seeing if you are I/O-constrained. Which I doubt you are: if you were, you'd see a higher percentage in "wa" (time spent waiting on I/O), and yours is close to zero. If you try something like "iostat 5" you will get a refresh of I/O usage every 5 seconds, for example. If you see one of your disks getting slammed, that would be something you could try to fix, either in code, or with faster disks, or RAID, or caching, or something like that.
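The vmstat output in the question already supports the "not I/O-constrained" reading. As a rough sketch, the helper below (its name and return shape are my own) summarizes vmstat rows by looking up the r, b and wa columns from the header line. The rows are the first four samples from the question.

```python
def summarize_vmstat(header, rows):
    """Mean run queue (r), max blocked (b) and mean I/O wait (wa)."""
    cols = header.split()
    r_i, b_i, wa_i = cols.index("r"), cols.index("b"), cols.index("wa")
    parsed = [[int(x) for x in row.split()] for row in rows if row.strip()]
    return {
        "mean_r": sum(p[r_i] for p in parsed) / len(parsed),
        "max_b": max(p[b_i] for p in parsed),
        "mean_wa": sum(p[wa_i] for p in parsed) / len(parsed),
    }


HEADER = "r b swpd free buff cache si so bi bo in cs us sy id wa"
ROWS = [
    "191 1 0 70536200 18216 546040 0 0 0 32 3007 2381 13 2 85 0",
    "187 0 0 70567328 18216 546068 0 0 0 4 2840 2468 12 1 86 0",
    "184 0 0 70650144 18216 546096 0 0 0 0 3802 2655 10 2 88 0",
    "186 0 0 70642768 18216 546120 0 0 0 0 4456 2431 13 1 86 0",
]
summary = summarize_vmstat(HEADER, ROWS)
print(summary)
```

A large r with b and wa near zero is the pattern of processes queued for CPU time, not for disk.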