LinuxQuestions.org
12-28-2005, 07:55 AM   #1
live_dont_exist
IRQ balancing


Hey All,

I've got an Enterprise Linux 4 system here and we are trying to tweak it to optimize performance for our Internet monitoring system. We found that the IRQs were being distributed a bit unevenly in /proc/interrupts, so I decided to try to even things out a bit:

For example:
Interrupt 0 was allowed on all CPUs, so I took that off and did an echo 1 > smp_affinity
..
I repeated the procedure for various other interrupts. It seemed to work well enough until I rechecked everything: the entries had changed on their own, from 1 to 4, or from 1 to 2, and so on. How is this happening? Is it because of the irqbalance process, and is it safe to turn it off completely?

Please advise. All info is appreciated. I'm working from the kernel documentation, so what I'm following should be correct.

The system is EL4 with a 2.6.9.5 kernel, 4 processors and 3 NICs (1 disabled).

Thanks
Arvind
 
12-28-2005, 01:44 PM   #2
stress_junkie
I really don't think that you have to be at all concerned about IRQ sharing because the data is sharing the same data bus anyway. What do you imagine that you will gain?
 
12-29-2005, 12:43 AM   #3
live_dont_exist
Hey,
Thanks for replying. I'm doing this for the first time, so I was just following the kernel documentation in proc.txt. It says that it's possible to assign interrupts to specific CPUs.

The reason why:

My Ethernet card (eth2) is tied to IRQ 169, and IRQ 169 seems to be tied to processor 2 all the time. We were dropping a few packets (we could see it in our compiled version of Snort), so my thinking was that it's because the card is generating so many interrupts (over 2000000 were listed in /proc/interrupts for IRQ 169). If I managed to rid CPU 2 of its duties in handling all interrupts except 169, I might see a slight performance improvement.
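
To make the mask arithmetic behind this plan concrete, here is a minimal Python sketch. It only computes the hex strings one would write to /proc/irq/N/smp_affinity; it does not touch /proc. The helper names cpu_mask and all_but are made up for illustration, and it assumes "processor 2" means CPU2 as numbered from 0 in /proc/interrupts.

```python
def cpu_mask(*cpus):
    """Hex smp_affinity string that allows only the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu          # bit N set means CPU N may handle the IRQ
    return format(mask, "x")


def all_but(cpu, ncpus):
    """Hex mask allowing every CPU in 0..ncpus-1 except `cpu`."""
    everyone = (1 << ncpus) - 1   # e.g. 0b1111 on a 4-processor box
    return format(everyone & ~(1 << cpu), "x")


# Hypothetical plan for the 4-processor machine described above:
nic_mask = cpu_mask(2)        # IRQ 169 (eth2) pinned to CPU2
others_mask = all_but(2, 4)   # every other IRQ kept off CPU2
print(nic_mask, others_mask)
```

With these assumptions the NIC's IRQ would get mask 4 and the remaining IRQs mask b (CPU0, CPU1 and CPU3).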

That was my logic. I did manage to disable irqbalance and set things individually, but when I came back to work this morning everything was back to ffffffff even though irqbalance was off.

Let me know if the logic I'm using has a hole in it, and why.

Thanks again
Arvind
 
12-29-2005, 04:47 AM   #4
stress_junkie
I'll look into it and edit this post with any information that I find. Now I know what you expected to achieve. My first impression is that you may be right. I haven't heard of a hardware device being assigned to a CPU.
 
12-29-2005, 10:59 AM   #5
live_dont_exist
Thanks, I appreciate the effort. Just one more pointer, though: I was speaking to my boss today about this, and his logic was that if people are given that much control to play around with Linux, they might end up wrecking it. So these tweaks we do in /proc (by the way, I tweaked a lot of virtual memory settings today) might be best kept for when we've almost reached a target and want to make some final changes to squeeze that little extra out of the system. That sounded logical, because the changes I made to the memory settings don't seem to add much.

But still, do check things out and let me know whenever you have the time.

Cheers
Arvind
P.S. My boss loves Windows, so we had a nice discussion about how much better Linux was. Futile, though; people never do understand.
 
12-30-2005, 09:46 AM   #6
stress_junkie
I've had a bit of fun looking for information on this subject. Once I figured out a good Google search phrase I found that most of the listings referred to only two separate pieces of information. One of these references talks about balancing network cards over multiple CPUs. This makes me think that you've already read this. Nevertheless here are the two documents that I found.

Short note on setting and testing IRQ affinity.
http://www.kernel.org/pub/scm/linux/...tation/IRQ-affinity.txt,v

Short but informative article on IRQ affinity.
http://blog.spoonix.org/?page_id=678

I know that process affinity isn't part of this discussion, but I found one article on that subject that I thought was interesting.

Article on software affinity. Binding processes to a particular CPU.
http://www.linuxjournal.com/article/6799
 
01-01-2006, 11:53 PM   #7
live_dont_exist
Thanks, stress. I will definitely take a look at the two articles you posted. Just FYI, because this is interesting: I turned irqbalance off,

as in chkconfig --level 35 irqbalance off, so it's gone.

Then we ran our performance tests again and guess what: the performance improved straight away, dramatically! I looked into /proc/irq/0 etc. and every single smp_affinity file had set itself to ffffffff. Strange?

If I do a cat /proc/interrupts, I see that all interrupts have bound themselves to processor 0 and the other processors are all having a day off. But the performance is better, dramatically better.
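
For anyone who wants to check this kind of imbalance with numbers instead of by eye, here is a rough Python sketch that totals the per-CPU columns of /proc/interrupts. The SAMPLE text is invented for illustration (it is not output from this machine), and per_cpu_totals is a made-up helper name. It assumes the usual layout: a header row of CPU names, then one row per IRQ with one counter per CPU.

```python
def per_cpu_totals(text):
    """Sum the interrupt counters in /proc/interrupts-style text, per CPU."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()                  # header row: CPU0 CPU1 ...
    totals = dict.fromkeys(cpus, 0)
    for line in lines[1:]:
        fields = line.split()
        # fields[0] is the IRQ label ("169:"); counters follow until the
        # first non-numeric field (controller type, device name).
        for cpu, value in zip(cpus, fields[1:]):
            if not value.isdigit():
                break
            totals[cpu] += int(value)
    return totals


# Invented sample in the usual layout, NOT real output from the system above.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  0:     500000          0          0          0    IO-APIC-edge  timer
169:    2000000          0          0          0   IO-APIC-level  eth2
ERR:          0
"""

totals = per_cpu_totals(SAMPLE)
print(totals)
```

On a real system the text would come from open("/proc/interrupts").read(); a lopsided result like the one in this sample matches the "everything on processor 0" picture described above.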

I'm running a test right now to check whether I was wrong, and have turned irqbalance back on. I will keep this post going till I understand what's going on.

Thanks for the links again
Cheers
Arvind
 
01-14-2006, 02:01 PM   #8
spoonix
Quote:
Originally Posted by live_dont_exist
Then we ran our performance tests again and guess what: the performance improved straight away, dramatically! I looked into /proc/irq/0 etc. and every single smp_affinity file had set itself to ffffffff. Strange?
Nah, that's expected. Think of /proc/irq/X/smp_affinity as kind of like a netmask... the numbers you put in there specify which processors the IRQ is allowed to run on. The default setting of ffffffff means "this IRQ may be handled by any of the processors in this machine".
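
As a small illustration of the bitmask idea, here is a Python sketch that turns an smp_affinity value into the list of CPUs it allows. The function name allowed_cpus is made up, and the CPU count is passed in by hand, assuming the 4-processor box from this thread.

```python
def allowed_cpus(mask_hex, ncpus):
    """List the CPU numbers (0-based) permitted by an smp_affinity hex mask."""
    # Larger machines show the mask as comma-separated 32-bit groups;
    # dropping the commas leaves one plain hex number.
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(ncpus) if mask & (1 << cpu)]


for value in ("ffffffff", "1", "2", "4"):
    print(value, "->", allowed_cpus(value, 4))
```

So ffffffff covers all four CPUs, while the values 1, 2 and 4 mentioned earlier in the thread each name a single CPU (CPU0, CPU1 and CPU2 respectively).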

As for the irqbalance mystery, yeah, it was responsible for the changing IRQ masks. That's what it does... randomly rolls the masks around in order to make sure that each proc spends a "fair" amount of time handling interrupts. Personally, I'd rather tie an interrupt to a processor and not worry about it anymore.

Also, as a personal opinion, I'd recommend checking your system to make sure that you're not running Intel's Hyper-Threading. I know some folks swear it improves performance, but I've never had any luck with it on a Linux server and I've found the best thing to do is just disable it. I don't have any experience with real dual procs, but from what I've read it sounds to me like those should be OK.

Good luck,
 
01-15-2006, 09:04 AM   #9
trickykid
Quote:
Originally Posted by live_dont_exist
The reason why:

My Ethernet card (eth2) is tied to IRQ 169, and IRQ 169 seems to be tied to processor 2 all the time. We were dropping a few packets (we could see it in our compiled version of Snort), so my thinking was that it's because the card is generating so many interrupts (over 2000000 were listed in /proc/interrupts for IRQ 169). If I managed to rid CPU 2 of its duties in handling all interrupts except 169, I might see a slight performance improvement.

That was my logic. I did manage to disable irqbalance and set things individually, but when I came back to work this morning everything was back to ffffffff even though irqbalance was off.
Are you sure you're talking about IRQs? IRQ numbers only go up to 15, starting with 0, so how are you getting an IRQ of 169? That would be my question.
 
01-15-2006, 09:35 PM   #10
spoonix
Quote:
Originally Posted by trickykid
Are you sure you're talking about IRQs? IRQ numbers only go up to 15, starting with 0, so how are you getting an IRQ of 169? That would be my question.
Through the (black) magic of APIC.

Here's some technical info on it that should give you a good idea of what's up without having to plod through Intel tech references (sorry for the obfuscation... the forums say my post count is too low to put up a link)

http :// osdev.berlios.de / pic.html

I don't fully understand it as I'm not a hardware dude, but APIC is basically necessary for handling interrupt addressing on multiple CPU systems and allowing the CPUs to talk to each other. Somewhere along the way, the comp engineer geeks found some other tricky stuff they could use APIC to do on uniprocessor machines and so by now it's pretty much become a standard on any modern mobo made today, and thus we get stuck with those annoying but harmless "spurious interrupt" messages.

APIC is basically a 24-line table that holds 64 bits per line and uses bitmasks to identify which processor the interrupt is associated with. So, if you have a SCSI card on a 4-proc system that happens to "belong" to CPU2 and its IRQ is set to 15, then APIC will map that out to 63 so that CPU0, CPU1, and CPU3 can all reference it.

CPU0 - 0-23
CPU1 - 24-47
CPU2 - 48-71
CPU3 - 72-95
etc

I think.

The exact numbers might be wrong, but that's the basic idea.

This would be an excellent time for a comp engineer or kernel hacker to step forward.

Any rate, the important thing to remember here is that it's just all Intel's fault.
 
  

