Server Performance - Load Avg, Swap, Memory?
At the moment I have web (Apache2), email (Postfix/Dovecot/Procmail), and MySQL all on one server running SuSE 9.3.
It's an OLD desktop box - a gateway with a 350 MHz Pentium II and 128 MB RAM with a 401 MB swap partition.
After adding some more software (Cacti, MRTG, etc.) I noticed Nagios giving me alerts for load averages every night when I run my backup processes.
During these times, I see peak Load Averages of around 5.5 for the 1 minute interval. I was able to find out a lot about load averages, what they mean, how they are calculated, etc. but I wasn't able to find any information on what the "normal" values are, or what my goal should be.
Also, at the moment with only the normal processes running, I have 68% free on swap, and 120 MB (of 128) RAM used. My current load average is 0.18, 0.37, 0.25... but when I run my nightly backup (tar gz about 1.6 GB and transfer it over NFS) it jumps to about 5.5 across the board.
I'm planning to upgrade to a newer (used, looking in the range of 2x PIII with 1 GB RAM) server. I'm really interested in, mainly, figuring out how my server should (ideally) be running, and using that as a basis for spec'ing out a new server.
Mainly, what I'm really asking is in practice (in the real world), what is a "good" load average (both for normal stress and maximum stress during a nightly backup), % memory usage, and % swap usage. I figure that if I'm using 128 MB of RAM and around 200 MB of swap, a new server with 1 GB RAM and the same configuration should be more than adequate...
Also, one other question... is there any way to roughly turn load average into processor speed... i.e. if I know that I have a load average of 5.5 in 1 min on a 350 MHz, if I double processor speed, how *should* (in the simplest theoretical example) that affect the load average? Cut it in half?
Thanks for any advice...
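Editor's sketch for the memory and swap percentages asked about above: on Linux, "used" RAM normally includes buffers and page cache that the kernel can give back, so 120 of 128 MB "used" does not by itself show memory pressure. The Python below is only an illustration under that assumption; the function names and the SAMPLE figures are hypothetical, not measurements from this server.

```python
def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict of integers (values in kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def usage_percentages(fields):
    """Return (% of RAM used by programs, % of swap used).

    Buffers and Cached are treated as reclaimable, so they are not
    counted as memory used by programs.
    """
    total = fields["MemTotal"]
    reclaimable = fields.get("Buffers", 0) + fields.get("Cached", 0)
    used_by_programs = total - fields["MemFree"] - reclaimable
    swap_total = fields.get("SwapTotal", 0)
    swap_used = swap_total - fields.get("SwapFree", 0)
    mem_pct = 100.0 * used_by_programs / total
    swap_pct = 100.0 * swap_used / swap_total if swap_total else 0.0
    return mem_pct, swap_pct


# Illustrative numbers only (kB); NOT taken from the server in this thread.
SAMPLE = """MemTotal:       128000 kB
MemFree:          8000 kB
Buffers:         10000 kB
Cached:          46000 kB
SwapTotal:      400000 kB
SwapFree:       272000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the illustrative sample off Linux
    mem_pct, swap_pct = usage_percentages(parse_meminfo(text))
    print("RAM used by programs: %.1f%%, swap used: %.1f%%" % (mem_pct, swap_pct))
```

With the sample figures, 120000 kB of 128000 kB looks "used", but only half of RAM is actually held by programs; the rest is cache.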
What is your load average composed of - run queue or I/O wait? Likely the latter.
Adding a processor may help - or may not. Adding memory probably will help.
Increase the swap when you upgrade - make it the same as installed memory as a minimum.
Loadavg is a *very* poor metric IMHO - gets too much press, and is too widely misunderstood.
As a general statement Linux has awful performance/tuning metrics.
Sorry for my ignorance on this subject.
During the nightly backup, I would assume it's an I/O problem. But how do I tell?
Is there a better metric to use? Can you point me to a howto/other doc covering this?
Essentially, I want to know 1) how badly my current system is stressed and whether there's anything I can do about it, and 2) get a rough idea of (with the same software, processes, etc.) how much processor speed/memory/swap/etc. the new system will need to have to perform optimally under identical load.
Have a look at the sysstat package - has a nice history collector. You can't tell anything without data.
The current iostat (part of sysstat) will apparently now give numbers for NFS - but you need to be on a (very) current kernel; 2.6.17 or later.
Suse 9.3 probably ain't going to do the job.
What are you using to get the loadavg number - does it also give the header info like "top"? The wa% will give you a clue if you are being strangled by I/O. It will also tell you how many running tasks you have, which will help decide if another CPU is going to give any benefit.
Perhaps try running top in batch mode, writing to a file every minute or so. Set up a config file to reverse sort the display by "state". It'll show you running (R) and uninterruptible (D) tasks. The latter are generally waiting for disk I/O.
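Editor's sketch: the wa% that top shows can also be derived from the aggregate "cpu" line of /proc/stat, sampled twice. The Python below is a rough illustration, not how top itself is implemented; the function names and the two sample lines are made up for the example.

```python
def cpu_fields(stat_line):
    """Split the aggregate 'cpu' line of /proc/stat into integer tick counters."""
    parts = stat_line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in parts[1:]]


def iowait_percent(before_line, after_line):
    """Percent of elapsed CPU time spent in iowait between two samples."""
    before = cpu_fields(before_line)
    after = cpu_fields(after_line)
    # Use at most the first eight counters (user, nice, system, idle, iowait,
    # irq, softirq, steal). Newer kernels append guest counters that are
    # already included in user time, so summing them would double count.
    deltas = [new - old for old, new in zip(before, after)][:8]
    elapsed = sum(deltas)
    if elapsed <= 0 or len(deltas) < 5:
        return 0.0
    return 100.0 * deltas[4] / elapsed  # index 4 is iowait


# Hypothetical samples taken some seconds apart; not real measurements.
BEFORE = "cpu  100 0 50 800 50 0 0 0"
AFTER = "cpu  150 0 70 1000 180 0 0 0"

if __name__ == "__main__":
    print("iowait over the interval: %.1f%%" % iowait_percent(BEFORE, AFTER))
```

To use it on a live box, read the first line of /proc/stat, wait a few seconds during the backup, read it again, and pass both lines in. A high figure while the CPU is otherwise idle points at disk or NFS rather than processor speed.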
Then at least you'll have something to work on.
See the manpage for how to set up the config file - it's dead easy.
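Editor's sketch: as an alternative to sorting top's display by state, the same R and D counts can be collected by reading /proc/<pid>/stat for every process. This Python is only an illustration; the function names are hypothetical, and it counts processes rather than individual threads.

```python
import collections
import glob


def task_state(stat_text):
    """Return the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so the state is the first field after the LAST ')'.
    after_comm = stat_text[stat_text.rindex(")") + 1:]
    return after_comm.split()[0]


def count_states(stat_texts):
    """Count how many tasks are in each state (R, D, S, ...)."""
    return collections.Counter(task_state(text) for text in stat_texts)


def read_all_stat_files(pattern="/proc/[0-9]*/stat"):
    """Read every per-process stat file currently visible under /proc."""
    texts = []
    for path in glob.glob(pattern):
        try:
            with open(path, errors="replace") as handle:
                texts.append(handle.read())
        except OSError:
            pass  # the process exited between listing and reading
    return texts


if __name__ == "__main__":
    counts = count_states(read_all_stat_files())
    print("running (R): %d, uninterruptible (D): %d" % (counts["R"], counts["D"]))
```

Run from cron every minute during the backup window and appended to a log, this gives a rough history of how many tasks were runnable versus stuck waiting on I/O.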
Try seeing if yast or some repo out there has the sysstat rpm.
This is what you want to use to see when and where your server is having issues: I/O or CPU, and if it's CPU, whether the time goes to user or kernel space.
Here is the output from my laptop (while I have a DVD playing and other stuff running; I hope the format looks OK when viewing):
Linux 2.6.19-1.2895.fc6 (strange)   02/02/2007
05:57:34 PM   LINUX RESTART
06:00:01 PM   CPU   %user   %nice   %system   %iowait   %steal   %idle
06:10:02 PM   all   44.59    0.01      7.78     11.18     0.00   36.44
06:20:02 PM   all   54.71    0.01      8.77      2.56     0.00   33.95
Average:      all   49.47    0.01      8.26      7.03     0.00   35.24
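Editor's sketch: once sar has collected a night's worth of data, the intervals with heavy I/O wait can be picked out mechanically. The Python below parses CPU report text like the output above and flags rows whose %iowait exceeds a threshold. The function name and the 10% threshold are arbitrary choices for illustration, not sysstat recommendations.

```python
def high_iowait_intervals(sar_text, threshold=10.0):
    """Return (timestamp, iowait) pairs where %iowait exceeds threshold."""
    column = None
    flagged = []
    for line in sar_text.splitlines():
        fields = line.split()
        if "%iowait" in fields:
            # Header row: remember which column holds %iowait.
            column = fields.index("%iowait")
            continue
        if column is None or len(fields) <= column or "all" not in fields:
            continue
        if fields[0] == "Average:":
            continue  # summary row has no timestamp, so its columns shift
        try:
            iowait = float(fields[column])
        except ValueError:
            continue
        timestamp = " ".join(fields[:fields.index("all")])
        if iowait > threshold:
            flagged.append((timestamp, iowait))
    return flagged


# The sample is the laptop output quoted in this post.
SAMPLE = """Linux 2.6.19-1.2895.fc6 (strange) 02/02/2007
05:57:34 PM LINUX RESTART
06:00:01 PM CPU %user %nice %system %iowait %steal %idle
06:10:02 PM all 44.59 0.01 7.78 11.18 0.00 36.44
06:20:02 PM all 54.71 0.01 8.77 2.56 0.00 33.95
Average: all 49.47 0.01 8.26 7.03 0.00 35.24
"""

if __name__ == "__main__":
    for timestamp, iowait in high_iowait_intervals(SAMPLE):
        print("%s  %%iowait=%.2f" % (timestamp, iowait))
```

On the sample it flags only the 06:10:02 PM interval. Column positions are taken from the header row, so it should tolerate sysstat versions that print a different set of columns, though that is untested here.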
If you look through the switches, it will even tell you which interrupt is getting the most CPU time and give you stats on your NIC, i.e. errors/packets. LOTS OF GREAT SYSTEM stuff. I have it on all my Red Hat / Fedora boxes - a must-have tool.
As far as what the proper load average is: I've only read once, in a very good IBM Linux performance book, to shoot for a 1.00 average.
Now that's kinda bunk IMHO. I have servers running a higher average than that and they respond very well.
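Editor's sketch: one common way to read a "1.00" target is per CPU, so a load of 2.0 on a dual-processor box counts the same as 1.0 on a single one. The Python below shows that normalisation. It is a rule of thumb and not taken from the IBM book; the function names and sample line are hypothetical. Keep in mind that the Linux load average also counts tasks in uninterruptible (usually disk or NFS) wait, so dividing by CPU count says little when the load is I/O-bound.

```python
import os


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def load_per_cpu(loads, cpu_count):
    """Divide each load average by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(load / cpu_count for load in loads)


# Hypothetical line in /proc/loadavg format; not a real measurement.
SAMPLE = "5.50 5.40 5.20 3/97 12345"

if __name__ == "__main__":
    try:
        with open("/proc/loadavg") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the sample off Linux
    cpus = os.cpu_count() or 1
    print("load per CPU (1/5/15 min):", load_per_cpu(parse_loadavg(text), cpus))
```

For the sample line, one CPU gives 5.5 per CPU and two CPUs give 2.75, which only means something if the tasks behind the number are actually competing for CPU.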
PS.. this is my first post as ive just joined 5 min ago..
I forgot to mention that sar by default (your distro / rpm may differ) keeps 7 days' worth of system stats, so you can go back in time and compare. I'm sure you can modify that setting.
Depending on the situation, it may not even be worth worrying about your load average in this situation. Considering:
1) You're looking to upgrade the server anyway - Save the time and effort and work out performance issues with the new server, if they still exist.
2) Your high load / low performance is overnight when (I assume) there aren't too many, if any, people using the server.
I was running a server at one stage to handle backups - when all my clients were backing up, I had load averages in the 50's and 60's, into low 70's at one stage I think. (Dual Xeon 3g + RAID 5; my guess a disk I/O bottleneck :p ) But this happened overnight, it wasn't affecting anyone, everything still got done and it was only a temporary setup anyway, so I didn't stress too much.
At least Linux will handle such high loads without dying :)
For me, it is worth figuring out.
The servers I'm looking at are Proliants, older models, with about 1 GB RAM and dual processors at 700 MHz-1.3 GHz. These seem to fit into my budget well, and I have a Proliant of this spec running at a remote site, which I really like.
Essentially, I want to be able to identify system performance issues on the current server as well as my others, know how to do this on the new server, and - most importantly - have some idea of how the new server needs to be spec'ed out in order to handle an equal load well.
It's part art / magic / luck / hard work to get an exact idea of how to spec out your next system. You first need to know exactly what is causing your bottlenecks. Since the motherboard, memory, drive bus and RPMs, and processors are usually better every time you upgrade, it's very hard to say just how much better the new system will handle the load.
if you could do a
and give your output... there will be a lot.
Good luck on the new box.