LinuxQuestions.org
Linux - Kernel This forum is for all discussion relating to the Linux kernel.
Old 08-06-2009, 12:05 AM   #1
praveen24
LQ Newbie
 
Registered: Jan 2009
Posts: 13

Rep: Reputation: 0
Under moderate load in a multicore system of 14 processors, one processor gets 100% usage


Hi,
I have a multicore system with 14 processors under heavy load, which I simulate by sending packets to the system. When I run "top", I find that one (apparently random) processor is 0% idle and spends up to 100% of its time in softirq, while all the other processors are 97% to 98% idle, i.e. only 2% to 3% used. As far as I know, softirqs are reentrant and the 2.6 scheduler performs good load balancing, so I want to know the possible reasons and where to start debugging. My driver is NAPI compliant. I am looking for suggestions as to whether the driver or the kernel code is at fault.
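To pin down which CPU is absorbing the interrupt load, /proc/interrupts (alongside top's per-CPU view) is the usual place to look. A small sketch of the idea in Python; the sample text is made up, shaped like the /proc/interrupts output posted later in this thread, and on a live box you would read the real file instead:

```python
# Sketch: find which CPU is taking the bulk of a NIC's interrupts by
# parsing /proc/interrupts-style text. SAMPLE is made-up data; on a
# live system you would use open("/proc/interrupts").read() instead.
SAMPLE = """\
           CPU0     CPU1     CPU2     CPU3
 45:          0        0        0   808204   CIU  eth0, eth1
"""

def hottest_cpu(text, irq="45"):
    lines = text.splitlines()
    ncpus = len(lines[0].split())          # header row: one token per CPU
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == irq + ":":
            counts = [int(x) for x in fields[1:1 + ncpus]]
            return max(range(ncpus), key=lambda i: counts[i]), counts
    raise ValueError("IRQ not found")

cpu, counts = hottest_cpu(SAMPLE)
print(cpu, counts)   # -> 3 [0, 0, 0, 808204]: CPU3 fields every interrupt
```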

Last edited by praveen24; 08-09-2009 at 11:55 PM. Reason: Changing no. of cores
 
Old 08-07-2009, 01:14 AM   #2
praveen24
Original Poster
Probable Reason

As this is an e1000 NAPI-compliant driver, it balances the IRQs and performs the rest of the packet-receive work in the poll function. Since the poll function dequeues the buffers, the number of interrupts drops considerably, so balancing of interrupts no longer plays a major role. Because activation and execution of a softirq are performed by the same CPU, whichever random CPU scheduled the softirq ends up at 100% occupancy. The questions now are: why is the scheduler not able to balance this load, and if this interpretation is correct, how can the issue be rectified?
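The key property here is that a softirq normally executes on the same CPU that raised it. A toy illustration (plain Python, not kernel code) of why that concentrates all the RX work on whichever CPU fields the hardware interrupt:

```python
# Toy illustration (not kernel code): the NET_RX softirq runs on the
# same CPU that raised it, so if the NIC's hardirq always lands on one
# CPU, that CPU accumulates all of the packet-processing work.
from collections import Counter

def simulate(packets, irq_cpu):
    work = Counter()
    for _ in range(packets):
        # hardirq fires on irq_cpu ... and the softirq it raises
        # is executed on that same CPU, so the work never spreads.
        work[irq_cpu] += 1
    return work

print(simulate(10_000, irq_cpu=3))   # -> Counter({3: 10000})
```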
 
Old 08-07-2009, 05:59 AM   #3
praveen24
Original Poster
IRQ is balanced

In fact, further investigation reveals that the IRQs are balanced on the machine.
 
Old 08-10-2009, 02:04 AM   #4
praveen24
Original Poster
Inevitable

This behaviour is actually in compliance with NAPI: when the feature is enabled, only the one CPU that first receives the interrupt processes the RX ring buffer, so the load mounts on that same CPU, and with high-volume traffic it reaches 100%. Up to that point it makes sense. Can anyone share some design changes for the driver so that, with one NIC and many CPUs, the processing can be distributed?
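For what it's worth, later kernels (around 2.6.35) added exactly such a mechanism: Receive Packet Steering (RPS) hashes each packet's flow tuple and steers the protocol processing to a CPU chosen by the hash, so distinct flows spread across CPUs while packets within one flow stay in order. A sketch of the idea (illustrative Python; `crc32` stands in for the kernel's real flow hash):

```python
# Sketch of the idea behind Receive Packet Steering (RPS): hash the
# flow 4-tuple and use the hash to pick a target CPU. Different flows
# land on different CPUs; one flow always maps to the same CPU, which
# preserves per-flow packet ordering.
import zlib

NCPUS = 14

def steer(src_ip, src_port, dst_ip, dst_port, ncpus=NCPUS):
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % ncpus   # crc32 stands in for the kernel's flow hash

# The same flow always maps to the same CPU:
a = steer("10.0.0.1", 1234, "10.0.0.2", 80)
b = steer("10.0.0.1", 1234, "10.0.0.2", 80)
assert a == b

# Many distinct flows spread over the CPUs:
cpus = {steer("10.0.0.1", p, "10.0.0.2", 80) for p in range(1024, 1124)}
print(len(cpus))
```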
 
Old 08-11-2009, 08:04 PM   #5
sundialsvcs
Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 5,382

Rep: Reputation: 1109
I would be extremely astonished to find that "14 processors or cores" would be saturated by any load-pattern that strictly originated from a physical device (or devices). CPUs can always run much faster than "the real world."

Naturally, you'd like for any one of several CPUs to be able to "take the interrupt," but really that does not matter so much. What really matters here is the handling of the workload that is represented by all those incoming packets. Presumably, each packet is "a request to 'do something,'" and it is the act of "doing something," not the act of handling the I/O traffic, that will make productive and balanced use of a large farm of available CPUs.

Each of the incoming packets should be moved as quickly as possible to a user-land queue which can be serviced by several processes or threads. Each of these threads would dequeue a work-request from the queue and then carry out that request. Each of them would then enqueue a response for later delivery.
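The queue-and-worker-pool pattern described above can be sketched as follows (illustrative Python; the work items and the "do something" step are made up stand-ins for packets and request handling):

```python
# Minimal sketch of the pattern described above: the I/O side enqueues
# work items (stand-ins for incoming packets), a pool of worker threads
# dequeues and processes them, and each worker enqueues a response for
# later delivery.
import queue
import threading

requests, responses = queue.Queue(), queue.Queue()
NWORKERS = 4

def worker():
    while True:
        item = requests.get()
        if item is None:          # sentinel: shut this worker down
            break
        responses.put(item * 2)   # "do something" with the request

threads = [threading.Thread(target=worker) for _ in range(NWORKERS)]
for t in threads:
    t.start()

for i in range(100):              # producer: the I/O-handling side
    requests.put(i)
for _ in threads:                 # one sentinel per worker
    requests.put(None)
for t in threads:
    t.join()

print(responses.qsize())          # -> 100 responses produced
```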

Quite naturally, some of the CPUs (i.e. the ones to which the devices are physically attached) will tend to become "I/O-handling specialists." The others will each be running the worker-threads that are handling the actual workload that is, presumably, this busy server's raison d'être.

Success really has precious-little to do with the kernel. This is a matter of good application design.
 
Old 08-14-2009, 07:33 AM   #6
praveen24
Original Poster
NAPI disabled

Agreed. There might be a way to solve this, such as creating threads with specific CPU affinity, but my priority is to distribute the load among several processors by changing something in the driver, or otherwise exploring every possibility on the kernel side.
I have now disabled NAPI. All NIC interrupts now go to only one CPU at a time, until I force them onto another CPU by writing to /proc/irq/.....
I googled and found only vague answers. The suggestions were:
(1) Run the irqbalance daemon, or enable CONFIG_IRQBALANCE.
I could not find this configuration option in my config file (kernel 2.6.21), and I did not find the process in the output of ps -ef either, so I am sure it is not part of my kernel. Moreover, if it was left out, there must have been some reason, so I am not going to use it soon.

(2) Write to /proc/irq/<no.>/smp_affinity.
I found this value already set to ffff. Even so, I changed it to some different values, but in vain: all the interrupts still fall on a single processor. One interesting thing: when I move the NIC interrupt from one CPU to another by setting smp_affinity to a single bit, it moves to the corresponding CPU, but when I write a bit pattern with multiple 1s, the machine crashes. I do not understand how the default can be ffff when writing that same value back crashes it.

My questions: will irqbalance address this issue? And if smp_affinity defaults to ffff, why are all the interrupts falling on a single CPU?
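For reference, smp_affinity is a hexadecimal CPU bitmask: bit N set means the IRQ may be delivered to CPU N, so ffff permits CPUs 0-15 (a multi-bit mask is only a permission set; without balancing, the interrupt can still stick to one CPU, and the crash on multi-bit writes points at a platform or driver bug rather than a malformed mask). A small sketch for building and decoding such masks (illustrative Python, not a system tool):

```python
# smp_affinity is a hexadecimal CPU bitmask: bit N set means CPU N may
# receive the interrupt. Helpers to build and decode such masks.
def cpus_to_mask(cpus):
    mask = 0
    for c in cpus:
        mask |= 1 << c
    return format(mask, "x")

def mask_to_cpus(mask_hex, ncpus=16):
    mask = int(mask_hex, 16)
    return [c for c in range(ncpus) if mask & (1 << c)]

print(cpus_to_mask([3]))          # -> "8": only CPU3
print(cpus_to_mask(range(16)))    # -> "ffff": all of CPUs 0-15 (the default above)
print(mask_to_cpus("ffff"))       # -> [0, 1, ..., 15]
```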
 
Old 08-19-2009, 07:27 AM   #7
praveen24
Original Poster
Hi, I hope some of the questions will be answered this time

ENVIRONMENT:
Kernel: 2.6.21
Driver: e1000
NAPI: disabled
No. of CPUs: 14

Q1. Does the driver code set interrupt affinity across all the cores only when NAPI is enabled? (This is not present in the open-source code.)

Q2. When NAPI is disabled, why do all the interrupts fall on only one CPU? (See /proc/interrupts.)

Q3. If I try to set the affinity in /proc, the kernel panics (even though the default is ffff, cat /proc/interrupts reports that only one CPU is receiving all the interrupts from eth0):
---------------------------------------------------------------
root@:/proc/irq/45> cat smp_affinity
ffff
---------------------------------------------------------------
         CPU00   CPU01   CPU02   CPU03   CPU04   CPU05   CPU06   CPU07
 45:         0       0       0  808204       0       0       0       0   CIU   eth0, eth1

----------------------------------------------------------------
Cpu0 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.3%us, 0.0%sy, 0.0%ni, 81.5%id, 0.0%wa, 5.3%hi, 12.9%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
 
Old 09-03-2009, 04:21 AM   #8
praveen24
Original Poster
Throughput degradation

I disabled NAPI and found that all the interrupts were going to one CPU. I then put a spinlock in my driver code, compiled it, and during execution set the IRQ's affinity to multiple cores by writing to the /proc file. That worked fine, but the outcome was that CPU utilization went up, which is understandable. What is not is that NIC throughput degraded, even though many CPUs are now processing interrupts.
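That is the expected cost of a single coarse lock: if one spinlock protects the whole RX path, only one CPU can make progress at a time, so the extra CPUs burn cycles contending on the lock instead of adding throughput. A toy demonstration (Python threads standing in for CPUs; the lock stands in for the driver spinlock) that one shared lock fully serializes the critical section:

```python
# Toy demonstration that one shared lock serializes all workers: no
# matter how many threads run, at most one is ever inside the critical
# section, so extra CPUs add contention, not throughput.
import threading

lock = threading.Lock()   # stand-in for the driver's RX spinlock
inside = 0                # threads currently in the critical section
max_inside = 0            # highest concurrency ever observed

def worker(iterations=1000):
    global inside, max_inside
    for _ in range(iterations):
        with lock:                             # "take the spinlock"
            inside += 1
            max_inside = max(max_inside, inside)
            inside -= 1

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(max_inside)   # -> 1: the lock admits exactly one worker at a time
```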
 
  


Tags
networking, smp