system performance
Hi All,
I need help troubleshooting system performance on 2 servers with the same configuration. One server performs very well, with CPU usage below 20%, but on the second server CPU usage goes above 100%.

Hardware: IBM x3550, 10GB RAM, 2 Quad Core 2.66, 146 15k SAS HDD, RAID 0
OS: CentOS 5.3
Apps: JBoss 4.2.3

iostat
###################
Linux 2.6.18-128.1.10.el5.centos.plus (web3) 07/11/2009

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.24   0.00     0.01     0.01    0.00  99.73

Device:   tps  Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
sda      0.37        0.36        9.25    742524  19080318
sda1     0.00        0.00        0.00      2082        46
sda2     0.00        0.00        0.00      1728         0
sda3     0.37        0.36        9.25    738378  19080272

iostat -xd
###########################
Linux 2.6.18-128.1.10.el5.centos.plus (web3) 07/11/2009

Dev:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda     0.00    0.80  0.01  0.35    0.36    9.25     26.18      0.01  26.12   4.77   0.18
sda1    0.00    0.00  0.00  0.00    0.00    0.00     17.02      0.00  13.12   8.40   0.00
sda2    0.00    0.00  0.00  0.00    0.00    0.00     36.77      0.00  10.68   9.55   0.00
sda3    0.00    0.80  0.01  0.35    0.36    9.25     26.18      0.01  26.12   4.77   0.18

hdparm -tT /dev/sda
##############################
/dev/sda:
Timing cached reads: 18572 MB in 1.99 seconds = 9310.58 MB/sec
HDIO_DRIVE_CMD(null) (wait for flush complete) failed: Inappropriate ioctl for device
Timing buffered disk reads: 314 MB in 3.02 seconds = 103.86 MB/sec
HDIO_DRIVE_CMD(null) (wait for flush complete) failed: Inappropriate ioctl for device

Mount Points
==============
/dev/sda1  /boot
/dev/sda2  swap
/dev/sda3  /

Can someone explain how I should interpret avgrq-sz, avgqu-sz, await, svctm and %util when judging system performance? For example, what the minimum await should be for a 15k HDD, what svctm should be, and so on. I am really confused after reading the documentation.

Please help
//Remy |
That iostat output shows only about 0.26% CPU usage. Get a snapshot of "w" and "ps -ef" when the system is under load.
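On the iostat -xd columns asked about in the first post, here is a rough sketch of how they relate to each other, as far as I understand the sysstat output. avgrq-sz is the average request size in sectors, avgqu-sz the average number of requests queued or in service, await the average time in ms from submitting a request to its completion (queueing plus service), svctm the estimated service time in ms, and %util the share of time the device was busy. There is no single "should be" figure: one rotation at 15,000 rpm takes 4 ms, so average rotational latency alone is about 2 ms, and a svctm of a few milliseconds is plausible for random I/O. Also note that iostat run without an interval reports averages since boot, not the current load. The Python below only re-derives the sda row from the post; the function name and the dictionary layout are my own, not part of any tool.

```python
# Values copied from the "sda" row of the iostat -xd output in the first post.
sda = {
    "r/s": 0.01, "w/s": 0.35,
    "rsec/s": 0.36, "wsec/s": 9.25,
    "avgrq-sz": 26.18, "avgqu-sz": 0.01,
    "await": 26.12, "svctm": 4.77, "%util": 0.18,
}

def derive(row):
    """Approximately recompute the derived columns from the basic rates."""
    iops = row["r/s"] + row["w/s"]            # requests completed per second
    sectors = row["rsec/s"] + row["wsec/s"]   # 512-byte sectors moved per second
    return {
        "iops": iops,
        # average request size in sectors
        "avgrq-sz": sectors / iops,
        # Little's law: requests in flight = arrival rate * time in system
        "avgqu-sz": iops * row["await"] / 1000.0,
        # fraction of each second spent servicing requests, as a percentage
        "%util": iops * row["svctm"] / 1000.0 * 100.0,
        # the part of await that is spent waiting in the queue
        "queue_wait_ms": row["await"] - row["svctm"],
    }

derived = derive(sda)
for name, value in derived.items():
    print(f"{name:>14}: {value:.3f}")
```

The recomputed values land close to the reported ones (small differences come from the rounded inputs). With %util this low, the disk in that snapshot is essentially idle; the numbers only become interesting when sampled with an interval while the server is under load, e.g. watching whether %util approaches 100 and await grows well beyond svctm.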
|
Meanwhile, I captured the output of top for the user running the JBoss apps, with a sleep of 5 seconds in a while loop:

top -b -n 1 <PID_OF_USER_RUNNING_JBOSS> >> /tmp/log.txt &

I kept monitoring the file with

tail -f /tmp/log.txt

and watched the output of top in another window. The two outputs differ, e.g.:

Output 1: the log file shows 102% CPU, top shows 85% CPU
Output 2: the log file shows 45% CPU, top shows 72% CPU

and so on. Is this because of the time interval? Can someone explain?
//Remy |
Yes. By default top updates its display every 3 seconds, so each figure is an average over a 3-second window. You can change this time interval, however.
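To illustrate why two monitors can disagree, here is a minimal Python sketch. It assumes a process's CPU percentage is simply the CPU time it consumed during a window divided by the length of that window. The sample values are made up for illustration and are not taken from the server in this thread.

```python
def cpu_percent(cpu_seconds_start, cpu_seconds_end, wall_seconds):
    """CPU usage over one window, as a percentage of a single core."""
    return (cpu_seconds_end - cpu_seconds_start) / wall_seconds * 100.0

# Hypothetical cumulative CPU time (in seconds) of one process,
# sampled once per second at t = 0, 1, 2, ... 6.
samples = [0.0, 1.2, 1.5, 2.4, 3.6, 3.9, 4.2]

three_sec = cpu_percent(samples[0], samples[3], 3)  # window t=0..3
five_sec = cpu_percent(samples[1], samples[6], 5)   # window t=1..6
one_sec = cpu_percent(samples[3], samples[4], 1)    # window t=3..4

print(f"3-second window: {three_sec:.0f}%")
print(f"5-second window: {five_sec:.0f}%")
print(f"1-second window: {one_sec:.0f}%")
```

The same process gives a different percentage for each window, so a log written every 5 seconds and an interactive top refreshing every 3 seconds will rarely agree. Values above 100% are also normal for a multithreaded JVM on a multi-core box, because top by default reports per-process usage relative to a single CPU.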
|
1 Attachment(s)
08:42:01 up 24 days, 11:21, 4 users, load average: 0.84, 1.00, 1.15

USER     TTY    FROM     LOGIN@  IDLE   JCPU    PCPU    WHAT
root     pts/1  <MY IP>  04:29   11:07  1.39s   0.05s   -bash
aares00  pts/2  <MY IP>  07:10   1:29m  10.91s  10.90s  top
root     pts/3  <MY IP>  05:29   0.00s  0.03s   0.03s   -bash

The output of ps is attached. |
Please wrap your output in CODE or QUOTE tags; use these formatting tools to format your output.
|
No, you can edit your post.
Sorry, can you get "ps aux" instead? I'm too used to Unixes where "aux" doesn't work, but that's what we want here. You have two quad cores, so eight cores. Your load average is only around 1; it's not bad until it gets above 8 (1 per core), at which point you're overloaded. |
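The per-core rule of thumb above can be checked with a few lines of Python. This is only a sketch: the helper name load_per_core is my own, and it assumes the load averages are printed in the "load average: a, b, c" form seen in the w output posted earlier.

```python
import re

def load_per_core(uptime_line, cores):
    """Return the 1-, 5- and 15-minute load averages divided by the core count."""
    match = re.search(r"load average: ([\d.]+), ([\d.]+), ([\d.]+)", uptime_line)
    if match is None:
        raise ValueError("no load average found in line")
    return [float(value) / cores for value in match.groups()]

# First line of the "w" output from this thread; 8 cores = 2 quad-core CPUs.
line = "08:42:01 up 24 days, 11:21, 4 users, load average: 0.84, 1.00, 1.15"
per_core = load_per_core(line, cores=8)

for label, value in zip(("1 min", "5 min", "15 min"), per_core):
    status = "overloaded" if value > 1.0 else "ok"
    print(f"{label:>6}: {value:.3f} per core ({status})")
```

All three values come out well below 1.0 per core, which matches the conclusion that the box was not overloaded when the snapshot was taken.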
1 Attachment(s)
|
Well, there's nothing using the CPU in that snapshot, and there are no processes in uninterruptible sleep (state 'D'). I don't see a problem in that snapshot.
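For anyone who wants to repeat that check on their own "ps aux" output, a small Python sketch follows. The sample rows are invented for illustration (they are not the attachment from this thread), and the parsing assumes the usual eleven-column ps aux layout with STAT in the eighth column.

```python
# Hypothetical "ps aux" output; every row here is made up for the example.
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  10348   692 ?        Ss   Jun17   0:03 init [3]
jboss     4242 12.5 20.1 2500000 2000000 ?     Sl   Jun17 300:10 java -server
root      5150  0.0  0.0      0     0 ?        D    08:40   0:00 [pdflush]
"""

def uninterruptible(ps_output):
    """Return (pid, command) for every process whose STAT starts with 'D'."""
    found = []
    for row in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = row.split(None, 10)  # the 11th field keeps the full command
        if len(fields) == 11 and fields[7].startswith("D"):
            found.append((int(fields[1]), fields[10]))
    return found

blocked = uninterruptible(SAMPLE)
print(blocked)
```

A process that stays in state D across several snapshots is usually waiting on I/O, so a growing list here together with a high load average points at the disk or a network filesystem rather than the CPU.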
|