LinuxQuestions.org
Old 09-06-2007, 04:24 PM   #1
robeadam
LQ Newbie
 
Registered: Jun 2005
Posts: 5

Rep: Reputation: 0
Slow response


I'm running a test against 2.6.9-42.7.ELsmp. We are considering upgrading from 2.4.21-32.0.1.ELsmp to this version, hence the test. However, when I run the test, the Linux box becomes very slow to respond: a command can take 1-5 minutes to return. Just issuing the date command takes 1.5 minutes:

rpd-routem114.cisco.com:27> date
Thu Sep 6 17:21:31 EDT 2007
rpd-routem114.cisco.com:28> date
Thu Sep 6 17:23:01 EDT 2007

I'm ssh'ing through eth0 so I looked there but found no errors:

rpd-routem114.cisco.com:21> sudo ethtool -S eth0
NIC statistics:
rx_packets: 62440
tx_packets: 43762
rx_bytes: 68680156
tx_bytes: 3819299
rx_errors: 0
tx_errors: 0
rx_dropped: 0
tx_dropped: 0
multicast: 0
collisions: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_fifo_errors: 0
rx_missed_errors: 0
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_window_errors: 0
tx_deferred: 0
tx_single_collisions: 0
tx_multi_collisions: 0
tx_flow_control_pause: 0
rx_flow_control_pause: 0
rx_flow_control_unsupported: 0
tx_tco_packets: 0
rx_tco_packets: 0

I also checked the CPU but it looks ok to me:

rpd-routem114.cisco.com:15> iostat
Linux 2.6.9-42.7.ELsmp (rpd-routem114.cisco.com) 09/06/2007

avg-cpu:  %user   %nice    %sys %iowait   %idle
           7.23    0.00   58.15    0.11   34.50

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.71        19.66        14.40     145705     106696

rpd-routem114.cisco.com:26> iostat
Linux 2.6.9-42.7.ELsmp (rpd-routem114.cisco.com) 09/06/2007

avg-cpu:  %user   %nice    %sys %iowait   %idle
           8.91    0.00   72.20    0.06   18.83

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.19        10.73        11.79     145753     160112


The same goes for memory:

rpd-routem114.cisco.com:7> free -m
             total       used       free     shared    buffers     cached
Mem:          1001        180        820          0         30         44
-/+ buffers/cache:        106        895
Swap:         4094          0       4094


Can anyone suggest anything else to look at to see why the box is responding so slowly?

Thanks,
Robert
 
Old 09-06-2007, 04:54 PM   #2
robeadam
LQ Newbie
 
Registered: Jun 2005
Posts: 5

Original Poster
Rep: Reputation: 0
Slow response - additional information

There is a tool we use to emulate BGP peers. I noticed that the response was really slow only while the tool was running. When I stopped the tool, the response time returned to normal. What's strange to me is that iostat doesn't show much more idle CPU than before, yet the response has greatly improved.

With tool running

avg-cpu:  %user   %nice    %sys %iowait   %idle
           7.23    0.00   58.15    0.11   34.50

avg-cpu:  %user   %nice    %sys %iowait   %idle
           8.91    0.00   72.20    0.06   18.83

avg-cpu:  %user   %nice    %sys %iowait   %idle
           9.03    0.00   73.18    0.06   17.73

With tool stopped:

avg-cpu:  %user   %nice    %sys %iowait   %idle
           8.72    0.00   70.73    0.05   20.49

avg-cpu:  %user   %nice    %sys %iowait   %idle
           8.72    0.00   70.68    0.05   20.55

avg-cpu:  %user   %nice    %sys %iowait   %idle
           8.66    0.00   70.21    0.05   21.07

During a test on the 2.4 kernel, the CPU has plenty free:

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.10    0.04    0.24    0.01   99.60

Robert
 
Old 09-06-2007, 05:43 PM   #3
ilikejam
Senior Member
 
Registered: Aug 2003
Location: Glasgow
Distribution: Fedora / Solaris
Posts: 3,109

Rep: Reputation: 96
Hi.

What does 'top' look like when it's behaving like this? Also, are there any more recent kernels you could try?

Dave
 
Old 09-06-2007, 07:46 PM   #4
robeadam
LQ Newbie
 
Registered: Jun 2005
Posts: 5

Original Poster
Rep: Reputation: 0
Hey Dave,

Thanks for the response.

When I ran top earlier, it showed the idle CPU at 0%:

top - 17:31:42 up 3:58, 1 user, load average: 102.39, 101.49, 101.00
Tasks: 264 total, 74 running, 190 sleeping, 0 stopped, 0 zombie
Cpu(s): 10.9% us, 88.7% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.4% hi, 0.0% si
Mem: 1025712k total, 198840k used, 826872k free, 41260k buffers
Swap: 4192924k total, 0k used, 4192924k free, 46620k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9656 root      15   0  190m  20m  460 S    4  2.1   2:08.26 routem.latest
 9665 root      15   0  190m  20m  460 S    4  2.1   2:07.64 routem.latest
 9641 root      15   0  190m  20m  460 R    4  2.1   2:07.39 routem.latest
...

"routem" is the tool we use to emulate the BGP peers. I'm not sure why iostat would report idle CPU while top shows none. However, since the 1, 5 & 15 minute averages are all over 100%, I can't really attribute the difference between iostat and top to the CPU required to run top.
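
One likely reason for the mismatch: when iostat is run without an interval argument, its report covers the whole time since boot, whereas top shows only the most recent refresh interval. Below is a minimal sketch, assuming the standard Linux /proc/stat layout, of computing idle CPU over an interval from two samples. The helper names and the sample figures are made up for illustration and are not taken from this machine.

```python
def parse_cpu_line(line):
    """Return the jiffy counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def idle_percent(before, after):
    """Idle share of CPU time between two samples of the 'cpu' line.

    Field order is user, nice, system, idle, ...; idle is index 3.
    """
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return 100.0 * deltas[3] / total


# Made-up samples: mostly idle since boot, but fully busy during the interval.
sample_1 = "cpu 1000 0 8000 91000 10 40 0"
sample_2 = "cpu 1100 0 8900 91000 10 40 0"

print(idle_percent(sample_1, sample_2))          # idle over the interval only
since_boot = parse_cpu_line(sample_2)
print(100.0 * since_boot[3] / sum(since_boot))   # idle averaged since boot
```

Running iostat with an interval (for example `iostat 5`) and reading the second and later reports should give figures comparable to top.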

We run Red Hat Enterprise Linux here, and I think the latest version deployed in our kickstart process is 4 Update 4. :-(

Robert
 
Old 09-06-2007, 08:11 PM   #5
ilikejam
Senior Member
 
Registered: Aug 2003
Location: Glasgow
Distribution: Fedora / Solaris
Posts: 3,109

Rep: Reputation: 96
Those load averages aren't percentages; they are the average number of processes trying to run. A load average of 1 represents a fully occupied single-core machine. If your machine is dual core, then a load average of 2 represents a perfectly loaded host. So if you're running a dual-processor host, then your load average is actually at roughly 5000%.
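
As a rough illustration of that arithmetic, here is a small sketch that converts a load average into a percentage of the available CPUs. The function name is hypothetical, and the two-processor count is only the assumption used above, not something confirmed in the thread.

```python
def load_percent(load_average, cpu_count):
    """Express a load average as a percentage of available CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return 100.0 * load_average / cpu_count


# 1-minute load average from the top output, assuming two processors.
print(f"{load_percent(102.39, 2):.1f}%")
```

On a live system, Python's `os.getloadavg()` and `os.cpu_count()` can supply the two inputs.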

Looks like the kernel is really chewing on something ( 88.7% sy ).

To be honest, the upgrade from 2.4 to 2.6 kernels isn't trivial - is the rest of the software on the host updated as well? There are numerous changes to binutils and other packages required for the 2.6 kernel, if memory serves.

Dave
 
Old 09-13-2007, 09:26 AM   #6
robeadam
LQ Newbie
 
Registered: Jun 2005
Posts: 5

Original Poster
Rep: Reputation: 0
It turns out that the problem was with the BGP emulator. It loops through all the peers it is emulating, then sleeps 10000 usec. This appears to have been OK on the 2.4 kernel, but it causes problems on the 2.6 kernel. Increasing that time to 50000 usec helped quite a bit but still doesn't completely resolve the problem.
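
To check how long such a sleep really lasts on a given kernel, one can simply time it. The following is a minimal sketch in Python, not the emulator's own code (which is not shown in this thread); the function name and the sample count are illustrative assumptions.

```python
import time


def measure_sleep(requested_usec, samples=20):
    """Return the average observed duration, in usec, of a requested sleep."""
    requested_sec = requested_usec / 1_000_000
    total = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_sec)
        total += time.perf_counter() - start
    return total / samples * 1_000_000


# Compare the two sleep values mentioned above.
for usec in (10000, 50000):
    observed = round(measure_sleep(usec, samples=5))
    print(usec, "usec requested ->", observed, "usec observed")
```

Comparing the observed figures on the 2.4 and 2.6 hosts would show whether the loop wakes more often on one kernel than the other.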

Thanks for the support!

Robert
 
Old 09-13-2007, 09:31 AM   #7
ilikejam
Senior Member
 
Registered: Aug 2003
Location: Glasgow
Distribution: Fedora / Solaris
Posts: 3,109

Rep: Reputation: 96
I see. The kernel tick interval was changed between 2.4 and 2.6, so maybe that's what's causing the problem.

The old timer was 100Hz, but on 2.6 you can choose between 100, 250, 300 and 1000Hz.
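
To illustrate why the tick rate could matter here, the sketch below uses a simplified model that is not taken from the kernel source: it assumes a timed sleep is rounded up to a whole number of ticks and then padded by one extra tick. Both the padding and the function name are assumptions made for illustration.

```python
def effective_sleep_usec(requested_usec, hz, extra_ticks=1):
    """Model a sleep rounded up to whole ticks plus optional padding.

    extra_ticks=1 is an assumption, not a documented kernel guarantee.
    """
    tick_usec = 1_000_000 // hz
    ticks = -(-requested_usec // tick_usec)  # ceiling division
    return (ticks + extra_ticks) * tick_usec


for hz in (100, 1000):
    for requested in (10000, 50000):
        actual = effective_sleep_usec(requested, hz)
        print(f"HZ={hz} requested={requested} modelled={actual} "
              f"wakeups/s={1_000_000 / actual:.1f}")
```

Under this model, a 10000 usec sleep lasts 20000 usec at 100Hz but only 11000 usec at 1000Hz, so each emulator process would wake almost twice as often on the faster tick. That would be consistent with the longer 50000 usec sleep helping, but it is only a hypothesis.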

Dave
 
  



Tags
cpu, memory, performance

