Linux - Server: This forum is for the discussion of Linux software used in a server-related context.
Hi,
I need help with some strange output from the top command
on my Debian server. We see a high load average with relatively
low CPU usage. iowait also seems normal.
I think this has something to do with Java socket threading, but
I don't know how to find out exactly what is causing the issue
or how to fix it.
We are running a Java socket server on this Debian machine.
It has a Dual-Core AMD Opteron(tm) Processor 1210
and 8GB RAM.
The java -version output is:
java version "1.5.0_16"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_16-b02)
Java HotSpot(TM) Server VM (build 1.5.0_16-b02, mixed mode)
I don't think there's too much to worry about. There seem to be quite a few idle processes, so you could possibly tune things a little. Maybe httpd is configured to spawn lots of children ... ?
High bandwidth usage alone won't consume much processor power.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Update: I mean that just downloading or uploading, without using the hard drive, won't use much CPU.
Example: a router can handle a lot of traffic on its slow processor.
When we shut down the Java socket server, the load average decreases to 2.0-3.0,
so I thought it was about Java. We also have another server (with higher traffic)
with the same Apache configuration, and it seems fine.
O.K., I'm confused - that should list all uninterruptible sleep tasks (which contribute to loadavg). I had a look at how loadavg is accumulated a while back - it seemed straightforward. What kernel are you on ("uname -a")?
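For anyone following along, here is a minimal sketch of one way to see which tasks are feeding the load average. It relies on the fact that Linux counts both runnable (R) and uninterruptible-sleep (D) tasks, and that every thread is a separate task, which matters for a threaded Java server. The function names are made up for illustration, and the command name is taken as a single word, so names containing spaces get truncated.

```shell
#!/bin/sh
# Keep only tasks in state R (runnable) or D (uninterruptible sleep),
# the two states Linux counts towards the load average.
# Input on stdin: one task per line as "STATE PID TID COMMAND".
filter_load_tasks() {
    awk '$1 ~ /^[RD]/'
}

# Count R/D tasks per command name, busiest command first.
count_by_command() {
    awk '$1 ~ /^[RD]/ { n[$4]++ } END { for (c in n) print n[c], c }' | sort -rn
}

# Live usage (-L lists every thread, not just every process):
#   ps -eLo state=,pid=,tid=,comm= | count_by_command
```

Running the live command a few times in a row while the load is high should show whether the java threads are mostly in R or in D, which is something a single top screen tends to hide.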
That's one of the major problems with a tool like top - no history. You see what you see and that's it. PLUS you only see what top wants you to see.
You could always try out collectl, either interactively or as a daemon. By default, in daemon mode it samples everything except processes every 10 seconds and processes every 60 seconds, because process data carries extra overhead.
BUT if you really want to see what's happening over a relatively short period of time edit /etc/collectl.conf and add "-i1:1" to the line 'DaemonCommands' and that will monitor everything once a second.
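A small sketch of that edit as a filter, in case you would rather script it than open an editor. The helper name is hypothetical, and the exact contents of the DaemonCommands line differ between collectl versions, so check your own file first. The filter only appends "-i1:1" to a line starting with DaemonCommands and leaves the line alone if the option is already there.

```shell
#!/bin/sh
# Append "-i1:1" to the DaemonCommands line unless it is already present.
# Reads a collectl.conf on stdin and writes the result to stdout,
# so the original file is never modified in place.
add_fast_interval() {
    sed '/^DaemonCommands/{/-i1:1/!s/$/ -i1:1/;}'
}

# Example usage (review the new file before copying it into place):
#   add_fast_interval < /etc/collectl.conf > /tmp/collectl.conf.new
```

Remember to take the option back out when you are done, since sampling everything once a second produces much larger log files.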
"service collectl start" and let it run for a few minutes and then "service collectl stop". now play back the data it collected - too many options to list - but if you run:
collectl -p /var/log/collectl/filename -sxxx -oT
You'll see data for the subsystems specified with 'xxx' along with time stamps. 'c' will show CPU, 'd' disk, etc. Run "collectl --showsubsys" for a complete listing.
If you want to look at your top processes over time, which is what got me started, you can:
collectl -p filename --top
and you'll see the top 10 processes for every second!!! If you want to see more or fewer of them, run "collectl -x" and look at the options for --top.
If you run "collectl -p filename -sc --verbose -oT" you'll see the load averages along with the number of running processes AND the number of process creations/sec, if that is a concern.
hey back - yes, I know you're a fan. I've seen previous posts by you recommending it. I do realize not everyone is on board with it, but I also realize not everybody is convinced monitoring is important. I was talking to someone the other day who was a sar user. Nothing wrong with sar, just that people use a monitoring interval that's much too long. I suggested that they at least drop the monitoring interval down to 10 seconds, as 10 minutes is pretty worthless. They said their vendor told them not to go below a minute, and I told them their vendor is wrong! If collectl generates less than 0.1% CPU load running at 10-second monitoring and it's written in Perl, sar has got to have a lighter footprint. But some people just don't get it.
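To put a number on why a long interval hides problems, here is a rough sketch with made-up data: sixty 10-second CPU samples (10 minutes in total), idle at 2% except for a 20-second burst at 100%. Both helper names and all the figures are hypothetical; the point is only the arithmetic of averaging.

```shell
#!/bin/sh
# Read one CPU-busy percentage per line; print the mean and the peak.
mean_and_peak() {
    awk '{ sum += $1; if ($1 > peak) peak = $1 }
         END { if (NR) printf "mean=%.1f peak=%.1f\n", sum / NR, peak }'
}

# Hypothetical data: 60 samples at 2% busy, except samples 30 and 31
# (a 20-second burst) pinned at 100%.
samples() {
    i=0
    while [ "$i" -lt 60 ]; do
        if [ "$i" -eq 30 ] || [ "$i" -eq 31 ]; then echo 100; else echo 2; fi
        i=$((i + 1))
    done
}

samples | mean_and_peak
```

A single 10-minute sample would report only the mean, a little over 5% busy, while the 10-second samples still show the burst at 100%.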
I would wonder why people don't use collectl:
- they don't believe in proactive monitoring
- they're happy with what they have
- they're scared of it
If the first, they're flat out wrong. If the second, that's fine as long as they monitor frequently. If the third I can help if they ask.
I believe EVERYONE should continuously monitor their systems at 5-15 second intervals. There are very few situations where I've seen monitoring have an impact on performance: applications that run at 100% CPU load and are fine-grained parallel jobs running on 1000 cores or more. If you don't know what a fine-grained parallel job is, you don't have to worry about collectl! First of all, not many people run parallel jobs, let alone fine-grained ones, and even fewer run on 1K cores or more. Even those who do run on that many cores still find a slight performance hit is worth it to be able to have the data available if something goes wrong.