LinuxQuestions.org
Old 03-11-2011, 06:52 AM   #1
sbauer72
Member
 
Registered: Mar 2011
Posts: 36

Rep: Reputation: 0
top command version 3.2.6 invalid results compared to collectD


ALL,

I am trying to get accurate CPU usage figures from top. When I ran collectD on the same system, it reported different numbers. The gap is most noticeable when CPU usage is close to 100%.

Has anyone experienced having the top command report different results than other CPU usage utilities?
 
Old 03-11-2011, 07:51 AM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,120

Rep: Reputation: 4120
All the tools I know use sampling - by definition this is only as accurate as the data collected.
Also helps if you measure the same thing. Have a look at the collectD FAQ for a brief discussion.

Last edited by syg00; 03-11-2011 at 07:53 AM. Reason: removed duplicate post
 
Old 03-14-2011, 03:44 PM   #3
sbauer72
Member
 
Registered: Mar 2011
Posts: 36

Original Poster
Rep: Reputation: 0
I agree that the data sampling must be good. Both collectD and top use the /proc/stat file to calculate their results.
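For anyone following along, here is a rough sketch of how a utilisation figure can be derived from two /proc/stat samples. It is not the code of top or collectD. The function names are made up for illustration, and counting iowait as idle time is an assumption that individual tools may handle differently.

```python
def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def busy_percent(before, after):
    """Percentage of non-idle time between two counter samples.

    Field order: user, nice, system, idle, iowait, irq, softirq, steal.
    Only the first eight fields are summed, because the guest fields that
    follow are already included in user and nice.
    Assumption: idle and iowait both count as "not busy".
    """
    deltas = [b - a for a, b in zip(before, after)][:8]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = sum(deltas[3:5])
    return 100.0 * (total - idle) / total
```

To try it, read /proc/stat, wait ten seconds, read it again, and pass both parsed samples to busy_percent. Two tools that take their samples at slightly different moments, or that classify iowait differently, will not agree exactly even though they read the same file.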

The tests of top and collectD were run against the same workload on the same system, but the results were different. Both tools sampled every ten seconds.

The only thing I can think of is that collectD has a daemon that collects the data from /proc/stat and sends it to a separate client program, whereas top is a single process that does all of the calculation and display itself. So top is doing much more work on the fly.

I feel that top has its limitations. It looks like top has a few bugs in it.

The only other difference I could think of is priorities on top vs other daemons or tasks in the system.
 
Old 03-15-2011, 12:42 AM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,120

Rep: Reputation: 4120
How were you running top - with "-d 10", or "-n 1" in a timed loop? It makes a big difference; the manpage warns about using a single invocation.
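To illustrate why a single sample differs from an interval: the counters in /proc/stat are cumulative since boot, so a tool holding only one sample has nothing to diff against. The sketch below uses made-up counter values and hypothetical function names, and it treats idle plus iowait as not busy, which is an assumption.

```python
def busy_from_counters(counters):
    """Busy percentage implied by one set of counters.

    Field order: user, nice, system, idle, iowait, irq, softirq, steal.
    """
    total = sum(counters[:8])
    idle = sum(counters[3:5])
    return 100.0 * (total - idle) / total if total else 0.0


def busy_over_interval(before, after):
    """Busy percentage for the window between two cumulative samples."""
    return busy_from_counters([b - a for a, b in zip(before, after)])


# Made-up values: a long, mostly idle uptime followed by a fully busy window.
before = [1000, 0, 500, 98000, 500, 0, 0, 0]
after = [1900, 0, 600, 98000, 500, 0, 0, 0]

since_boot = busy_from_counters(after)          # dominated by the idle uptime
last_window = busy_over_interval(before, after)  # only the recent window
```

With these values the since-boot figure stays in the low single digits while the interval figure is 100%, which is the kind of gap you would see near saturation.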

I looked into this a while back too, for similar reasons I suspect. As it happens, I have an old strace of top lying around. It probes /proc/stat at the beginning, processes all the pids (regardless of whether you limit it to a specific pid/user), then probes /proc/stat again, presumably to work out the summary numbers (I haven't looked at the code).
This time lag skews all the numbers somewhat.
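If you want a rough feel for the size of that lag on your own box, something like the following times one pass over the per-process stat files. This is only a sketch: it does not reproduce what top does internally, the function names are hypothetical, and top reads more than just the stat file for each pid, so treat the result as a lower bound.

```python
import os
import time


def walk_pid_stats(proc_root="/proc"):
    """Read every <pid>/stat file under proc_root and return how many were read."""
    count = 0
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, name, "stat")) as handle:
                handle.read()
            count += 1
        except OSError:
            # The process exited between listdir and open, or is unreadable.
            continue
    return count


def timed_walk(proc_root="/proc"):
    """Return (number of stat files read, elapsed seconds) for one pass."""
    start = time.perf_counter()
    count = walk_pid_stats(proc_root)
    return count, time.perf_counter() - start
```

Calling timed_walk() on a busy system with many processes shows how long the window between two /proc/stat reads could stretch, and the CPU time the walk itself burns ends up in the numbers being reported.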

You call it a bug, maybe it's just working as designed. All code has design decisions, you just have to somehow figure out what they were and how that might affect you.
I haven't tried collectD - thanks for that, I'll give it a try-out. You might also want to have a look at collectl.
 
Old 03-15-2011, 01:34 PM   #5
sbauer72
Member
 
Registered: Mar 2011
Posts: 36

Original Poster
Rep: Reputation: 0
I am running it with the -d 10 option. The exact command I enter at the command line is top -b -d 10.

Did you notice how much of a lag you had?
 
Old 03-26-2011, 08:05 AM   #6
markseger
Member
 
Registered: Jul 2003
Posts: 244

Rep: Reputation: 26
syg00 - interesting comment about the way top calculates the CPU time - in other words it's actually measuring its own contribution to the load. That's something I guess I hadn't thought too much about with collectl, but when doing multiple things there's no getting around it. If you report CPU and process metrics you'd have the same result. On the other hand, if you ONLY measure CPU with collectl, that's all you'll get, as it reads /proc/stat and nothing else. Reading through all the /proc/pid structures to get the top processes IS very heavyweight, and it is a reason collectl defaults to only reporting these stats once a minute.

The other thing I'd comment on is if you really want to see what the CPU is doing, I'd run collectl with a sampling interval of 1 second or even 0.1 seconds. The system will barely notice the load. And to get even more data use --verbose. In other words:

collectl -sc --verbose [default=-i1]

-mark
 
  

