LinuxQuestions.org
Old 12-26-2012, 04:27 AM   #1
akiuni
Member
 
Registered: Sep 2012
Location: France
Distribution: debian
Posts: 56

Rep: Reputation: Disabled
kernel SMP + SLUB issues ?


Hi all

I'm facing some difficulties with Linux kernels on servers equipped with 4 CPU sockets (4x8 cores + Hyperthreading = 64 logical cores in total) when the SLUB allocator is enabled.

Here are some examples :
- 4CPU + SLUB + kernel 2.6.35.14 = kernel panic after about 1 h 10 min of uptime (stack trace at the end of my post).

- 4CPU + SLUB + kernel 2.6.39 = very poor networking performance. A bottleneck seems to appear in the softirqs, which are concentrated on cpu0 to cpu3.

- 4CPU + SLUB + kernel 3.2.34 = same behavior

So I'm wondering whether there is any known issue with the combination of the SMP and SLUB kernel options?

CONFIG_SLUB=y
CONFIG_SLUB_DEBUG=y
CONFIG_SMP=y
CONFIG_X86_64_SMP=y
CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_SCSI_SAS_HOST_SMP=y

Also, do you know of any option that I could enable to improve SMP/SLUB performance?

thank you
best regards,
Julien

stacktrace :
Code:
Dec  5 17:07:10 Host kernel: BUG: unable to handle kernel paging request at 000000007f87312b
Dec  5 17:07:10 Host kernel: IP: [<ffffffff810c5228>] __d_lookup+0x88/0x150
Dec  5 17:07:10 Host kernel: PGD 105c776067 PUD 0
Dec  5 17:07:10 Host kernel: Oops: 0000 [#1] SMP
Dec  5 17:07:10 Host kernel: last sysfs file: /sys/class/scsi_host/host2/proc_name
Dec  5 17:07:10 Host kernel: CPU 32
Dec  5 17:07:10 Host kernel: Modules linked in: pkp_drv
Dec  5 17:07:10 Host kernel:
Dec  5 17:07:10 Host kernel: Pid: 18757, comm: keepalived Not tainted 2.6.35.14-Host64 #7 ....../PowerEdge R810
Dec  5 17:07:10 Host kernel: RIP: 0010:[<ffffffff810c5228>]  [<ffffffff810c5228>] __d_lookup+0x88/0x150
Dec  5 17:07:10 Host kernel: RSP: 0018:ffff88105b9c5b98  EFLAGS: 00210206
Dec  5 17:07:10 Host kernel: RAX: 0000000000000005 RBX: 000000007f87312b RCX: 0000000000000017
Dec  5 17:07:10 Host kernel: RDX: 018721dfc6947272 RSI: ffff88105b9c5c68 RDI: ffff88107f810240
Dec  5 17:07:10 Host kernel: RBP: ffff88105b9c5be8 R08: ffff88105b9c5c68 R09: 000000000000ffff
Dec  5 17:07:10 Host kernel: R10: 0000000000000005 R11: 000000000000000a R12: ffff88105b9c5c68
Dec  5 17:07:10 Host kernel: R13: 0000000008927e69 R14: ffff88107f810240 R15: 0000000000000000
Dec  5 17:07:10 Host kernel: FS:  0000000000000000(0000) GS:ffff880002800000(0063) knlGS:00000000f742eb80
Dec  5 17:07:10 Host kernel: CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
Dec  5 17:07:10 Host kernel: CR2: 000000007f87312b CR3: 0000001061dfe000 CR4: 00000000000006e0
Dec  5 17:07:10 Host kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec  5 17:07:10 Host kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec  5 17:07:10 Host kernel: Process keepalived (pid: 18757, threadinfo ffff88105b9c4000, task ffff88105a6c5460)
Dec  5 17:07:10 Host kernel: Stack:
Dec  5 17:07:10 Host kernel: dead000000200200 0000000000007125 0000000000000005 00000005ffb57b88
Dec  5 17:07:10 Host kernel: <0> ffff88105b9c5c78 00000000000001ae ffff88105b9c5c68 ffff88107f810240
Dec  5 17:07:10 Host kernel: <0> 0000000000007125 0000000000000000 ffff88105b9c5c18 ffffffff810c532b
Dec  5 17:07:10 Host kernel: Call Trace:
Dec  5 17:07:10 Host kernel: [<ffffffff810c532b>] d_lookup+0x3b/0x60
Dec  5 17:07:10 Host kernel: [<ffffffff810c53c9>] d_hash_and_lookup+0x79/0xa0
Dec  5 17:07:10 Host kernel: [<ffffffff8110269d>] proc_flush_task+0x8d/0x250
Dec  5 17:07:10 Host kernel: [<ffffffff8103bf52>] release_task+0x32/0x3c0
Dec  5 17:07:10 Host kernel: [<ffffffff8103c817>] wait_consider_task+0x537/0x950
Dec  5 17:07:10 Host kernel: [<ffffffff8103cd1d>] do_wait+0xed/0x220
Dec  5 17:07:10 Host kernel: [<ffffffff8103cef1>] sys_wait4+0xa1/0xf0
Dec  5 17:07:10 Host kernel: [<ffffffff8103b580>] ? child_wait_callback+0x0/0x70
Dec  5 17:07:10 Host kernel: [<ffffffff8106c01f>] compat_sys_wait4+0x8f/0xd0
Dec  5 17:07:10 Host kernel: [<ffffffff810b1890>] ? vfs_read+0x140/0x180
Dec  5 17:07:10 Host kernel: [<ffffffff8102a82b>] sys32_waitpid+0xb/0x10
Dec  5 17:07:10 Host kernel: [<ffffffff81029cc5>] sysenter_dispatch+0x7/0x2b
Dec  5 17:07:10 Host kernel: Code: 05 8e 7a 83 00 48 8b 00 48 89 c3 8b 45 cc 48 85 db 48 89 45 c0 75 14 eb 5a 66 2e 0f 1f 84 00 00 00 00 00 48 8b 1b 48 85 db 74 48 <48> 8b 03 4c 8d 63 e8 0f 18 08 45 39 6c 24 30 75 e7 4d 39 74 24
Dec  5 17:07:10 Host kernel: RIP  [<ffffffff810c5228>] __d_lookup+0x88/0x150
Dec  5 17:07:10 Host kernel: RSP <ffff88105b9c5b98>
Dec  5 17:07:10 Host kernel: CR2: 000000007f87312b
Dec  5 17:07:10 Host kernel: ---[ end trace cd9ae1febb12caac ]---
Dec  5 17:08:02 Host kernel: BUG: unable to handle kernel paging request at 000000007f87312b
Dec  5 17:08:02 Host kernel: IP: [<ffffffff810c5228>] __d_lookup+0x88/0x150
Dec  5 17:08:02 Host kernel: PGD 105c6ba067 PUD 0
Dec  5 17:08:02 Host kernel: Oops: 0000 [#2] SMP
Dec  5 17:08:02 Host kernel: last sysfs file: /sys/class/scsi_host/host2/proc_name
Dec  5 17:08:02 Host kernel: CPU 5
Dec  5 17:08:02 Host kernel: Modules linked in: pkp_drv
Dec  5 17:08:02 Host kernel:
Dec  5 17:08:02 Host kernel: Pid: 29486, comm: ps Tainted: G      D     2.6.35.14-Host64 #7 ....../PowerEdge R810
Dec  5 17:08:02 Host kernel: RIP: 0010:[<ffffffff810c5228>]  [<ffffffff810c5228>] __d_lookup+0x88/0x150
Dec  5 17:08:02 Host kernel: RSP: 0018:ffff88105c707d18  EFLAGS: 00010206
Dec  5 17:08:02 Host kernel: RAX: 0000000000000005 RBX: 000000007f87312b RCX: 0000000000000017
Dec  5 17:08:02 Host kernel: RDX: 018721dfc6947272 RSI: ffff88105c707dc8 RDI: ffff88107f810240
Dec  5 17:08:02 Host kernel: RBP: ffff88105c707d68 R08: ffff88105c707dc8 R09: ffffffff81101cb0
Dec  5 17:08:02 Host kernel: R10: 0000000000000005 R11: 000000000000000a R12: ffff88105c707dc8
Dec  5 17:08:02 Host kernel: R13: 0000000008927e69 R14: ffff88107f810240 R15: ffff88105c707e78
Dec  5 17:08:02 Host kernel: FS:  0000000000000000(0000) GS:ffff8800024a0000(0063) knlGS:00000000f7659ad0
Dec  5 17:08:02 Host kernel: CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
Dec  5 17:08:02 Host kernel: CR2: 000000007f87312b CR3: 000000105c7a2000 CR4: 00000000000006e0
Dec  5 17:08:02 Host kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec  5 17:08:02 Host kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec  5 17:08:02 Host kernel: Process ps (pid: 29486, threadinfo ffff88105c706000, task ffff88105d2f7080)
Dec  5 17:08:02 Host kernel: Stack:
Dec  5 17:08:02 Host kernel: ffff88107f2a5fa0 ffff88107fc68000 0000000000000005 00000005811c5c49
Dec  5 17:08:02 Host kernel: <0> ffff88105c707e78 00000000000001ae ffff88105c707dc8 ffff88107f810240
Dec  5 17:08:02 Host kernel: <0> 0000000000000005 ffff88105c707e78 ffff88105c707d98 ffffffff810c532b
Dec  5 17:08:02 Host kernel: Call Trace:
Dec  5 17:08:02 Host kernel: [<ffffffff810c532b>] d_lookup+0x3b/0x60
Dec  5 17:08:02 Host kernel: [<ffffffff81101cb0>] ? proc_pid_instantiate+0x0/0xd0
Dec  5 17:08:02 Host kernel: [<ffffffff810fedd6>] proc_fill_cache+0x86/0x170
Dec  5 17:08:02 Host kernel: [<ffffffff810eed50>] ? compat_filldir+0x0/0xf0
Dec  5 17:08:02 Host kernel: [<ffffffff811023cd>] proc_pid_readdir+0x19d/0x200
Dec  5 17:08:02 Host kernel: [<ffffffff810eed50>] ? compat_filldir+0x0/0xf0
Dec  5 17:08:02 Host kernel: [<ffffffff810ba66c>] ? path_put+0x2c/0x40
Dec  5 17:08:02 Host kernel: [<ffffffff810eed50>] ? compat_filldir+0x0/0xf0
Dec  5 17:08:02 Host kernel: [<ffffffff810eed50>] ? compat_filldir+0x0/0xf0
Dec  5 17:08:02 Host kernel: [<ffffffff810fe8d5>] proc_root_readdir+0x45/0x60
Dec  5 17:08:02 Host kernel: [<ffffffff810c0d73>] vfs_readdir+0xb3/0xd0
Dec  5 17:08:02 Host kernel: [<ffffffff810f0a33>] compat_sys_getdents+0x83/0xe0
Dec  5 17:08:02 Host kernel: [<ffffffff81029cc5>] sysenter_dispatch+0x7/0x2b
Dec  5 17:08:02 Host kernel: Code: 05 8e 7a 83 00 48 8b 00 48 89 c3 8b 45 cc 48 85 db 48 89 45 c0 75 14 eb 5a 66 2e 0f 1f 84 00 00 00 00 00 48 8b 1b 48 85 db 74 48 <48> 8b 03 4c 8d 63 e8 0f 18 08 45 39 6c 24 30 75 e7 4d 39 74 24
Dec  5 17:08:03 Host kernel: RIP  [<ffffffff810c5228>] __d_lookup+0x88/0x150
Dec  5 17:08:03 Host kernel: RSP <ffff88105c707d18>
Dec  5 17:08:03 Host kernel: CR2: 000000007f87312b
Dec  5 17:08:03 Host kernel: ---[ end trace cd9ae1febb12caad ]---
NB: the same behavior occurs with the pkp_drv module unloaded.

Additional information:

Code:
# echo "Code: 05 8e 7a 83 00 48 8b 00 48 89 c3 8b 45 cc 48 85 db 48 89 45 c0 75 14 eb 5a 66 2e 0f 1f 84 00 00 00 00 00 48 8b 1b 48 85 db 74 48 <48> 8b 03 4c 8d 63 e8 0f 18 08 45 39 6c 24 30 75 e7 4d 39 74 24" | ./scripts/decodecode
All code
========
   0:   05 8e 7a 83 00          add    $0x837a8e,%eax
   5:   48 8b 00                mov    (%rax),%rax
   8:   48 89 c3                mov    %rax,%rbx
   b:   8b 45 cc                mov    -0x34(%rbp),%eax
   e:   48 85 db                test   %rbx,%rbx
  11:   48 89 45 c0             mov    %rax,-0x40(%rbp)
  15:   75 14                   jne    0x2b
  17:   eb 5a                   jmp    0x73
  19:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  20:   00 00 00
  23:   48 8b 1b                mov    (%rbx),%rbx
  26:   48 85 db                test   %rbx,%rbx
  29:   74 48                   je     0x73
  2b:*  48 8b 03                mov    (%rbx),%rax     <-- trapping instruction
  2e:   4c 8d 63 e8             lea    -0x18(%rbx),%r12
  32:   0f 18 08                prefetcht0 (%rax)
  35:   45 39 6c 24 30          cmp    %r13d,0x30(%r12)
  3a:   75 e7                   jne    0x23
  3c:   4d                      rex.WRB
  3d:   39                      .byte 0x39
  3e:   74 24                   je     0x64

Code starting with the faulting instruction
===========================================
   0:   48 8b 03                mov    (%rbx),%rax
   3:   4c 8d 63 e8             lea    -0x18(%rbx),%r12
   7:   0f 18 08                prefetcht0 (%rax)
   a:   45 39 6c 24 30          cmp    %r13d,0x30(%r12)
   f:   75 e7                   jne    0xfffffffffffffff8
  11:   4d                      rex.WRB
  12:   39                      .byte 0x39
  13:   74 24                   je     0x39
Code:
# make tags
# vim -t __d_lookup

./fs/dcache.c
struct dentry * __d_lookup(struct dentry * parent, struct qstr * name)
{
        unsigned int len = name->len;
        unsigned int hash = name->hash;
        const unsigned char *str = name->name;
        struct hlist_head *head = d_hash(parent,hash);
        struct dentry *found = NULL;
        struct hlist_node *node;
        struct dentry *dentry;

        rcu_read_lock();

        hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
                struct qstr *qstr;

                if (dentry->d_name.hash != hash)
                        continue;
                if (dentry->d_parent != parent)
                        continue;

                spin_lock(&dentry->d_lock);

                /*
                 * Recheck the dentry after taking the lock - d_move may have
                 * changed things.  Don't bother checking the hash because we're
                 * about to compare the whole name anyway.
                 */
                if (dentry->d_parent != parent)
                        goto next;

                /* non-existing due to RCU? */
                if (d_unhashed(dentry))
                        goto next;

                /*
                 * It is safe to compare names since d_move() cannot
                 * change the qstr (protected by d_lock).
                 */
                qstr = &dentry->d_name;
                if (parent->d_op && parent->d_op->d_compare) {
                        if (parent->d_op->d_compare(parent, qstr, name))
                                goto next;
                } else {
                        if (qstr->len != len)
                                goto next;
                        if (memcmp(qstr->name, str, len))
                                goto next;
                }

                atomic_inc(&dentry->d_count);
                found = dentry;
                spin_unlock(&dentry->d_lock);
                break;
next:
                spin_unlock(&dentry->d_lock);
        }
        rcu_read_unlock();

        return found;
}
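
Putting the disassembly and the source side by side: the loop "mov (%rbx),%rbx" / "test %rbx,%rbx" looks like the hash-chain walk of hlist_for_each_entry_rcu, "lea -0x18(%rbx),%r12" like the step from the d_hash node back to its dentry, and "cmp %r13d,0x30(%r12)" like the d_name.hash comparison. The trapping "mov (%rbx),%rax" then reads through a chain pointer that is not a kernel address at all. The Python sketch below only redoes that arithmetic on the values printed in the first oops. The offset is read from the disassembly, the kernel-half boundary assumes ordinary 48-bit x86_64 addressing, and the helper names are made up for illustration.

```python
# Register values copied from the first oops in this thread.
CR2 = 0x000000007F87312B  # address the CPU failed to read
RBX = 0x000000007F87312B  # pointer dereferenced by "mov (%rbx),%rax"
RDI = 0xFFFF88107F810240  # same value as R14; presumably the parent dentry

# Assumption: 48-bit x86_64 addressing, where kernel addresses occupy the
# upper canonical half starting at this value.
KERNEL_HALF_START = 0xFFFF800000000000

D_HASH_OFFSET = 0x18  # taken from "lea -0x18(%rbx),%r12"


def is_kernel_address(addr):
    """True if addr lies in the kernel half of the address space."""
    return addr >= KERNEL_HALF_START


def dentry_from_hash_node(node_addr):
    """Mimic the lea: step back from a d_hash node to the start of its dentry."""
    return node_addr - D_HASH_OFFSET


def describe():
    """Summarise what the register values say about the faulting access."""
    return [
        "fault address equals RBX: {}".format(CR2 == RBX),
        "RBX is a kernel address: {}".format(is_kernel_address(RBX)),
        "RDI is a kernel address: {}".format(is_kernel_address(RDI)),
    ]


print("\n".join(describe()))
```

Both oopses report the same CR2 and the same RDI/R14 value, which would fit a single bad pointer left in one dentry hash chain. The trace alone does not show whether that pointer was overwritten or belongs to a dentry that was already freed.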

Last edited by onebuck; 12-26-2012 at 07:21 AM. Reason: clean-up by use of 'code' tags
 
Old 12-26-2012, 06:24 AM   #2
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,292

Rep: Reputation: 2322
Have you tried the latest kernel? There are a LOT of options now whose help text refers to machines with large numbers of CPUs (SMP).
 
Old 12-26-2012, 07:20 AM   #3
akiuni
Member
 
Registered: Sep 2012
Location: France
Distribution: debian
Posts: 56

Original Poster
Rep: Reputation: Disabled
Hi, thank you for your prompt answer

Well, I've tried the 3.2.31 kernel, which is quite recent (by the way, I made a mistake in my initial post: it's not 3.2.34 but 3.2.31). I would prefer to stay on the 3.2 kernels because that is the series supported in Debian squeeze 6.0. Do you think there is a big gap between 3.2.31 and 3.2.35 (the latest 3.2 kernel)?
 
Old 12-26-2012, 07:25 AM   #4
onebuck
Moderator
 
Registered: Jan 2005
Location: Central Florida 20 minutes from Disney World
Distribution: Slackware®
Posts: 13,925
Blog Entries: 44

Rep: Reputation: 3159
Moderator Response

Moved: This thread is more suitable in <Linux-General> and has been moved accordingly to help your thread/question get the exposure it deserves.
 
Old 12-27-2012, 02:29 AM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,128

Rep: Reputation: 4120
Personally I wouldn't have thought a kernel oops query from someone prepared to attempt to decode the issue deserves to be plonked in "general".

If you really think you have an issue in the slab allocator, open a bug against it. Probably too intricate for many of us here to help much.
 
Old 12-27-2012, 03:13 AM   #6
akiuni
Member
 
Registered: Sep 2012
Location: France
Distribution: debian
Posts: 56

Original Poster
Rep: Reputation: Disabled
Well my knowledge stops where the code starts...

I can't open a bug because the oops seems to be solved in the 2.6.39 kernel, although with poor performance, I suppose. Also, I may have missed an important option in the .config, and that was the goal of my post...

I will try to locate the root cause of the bad performance more precisely, because my only clue today is "SMP+SLUB+softirqs". Depending on what I find, I'll be back on LQ to post the answer!
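
One way to put a number on the "softirqs concentrated on cpu0 to cpu3" impression is to read /proc/softirqs, which lists one cumulative counter per CPU for each softirq type. A minimal Python 3 sketch, assuming the usual layout (a header row of CPU names, then one "NAME:" row per softirq). The function names are hypothetical and nothing here is specific to SLUB.

```python
def parse_softirqs(text):
    """Return {softirq_name: [count_cpu0, count_cpu1, ...]} from /proc/softirqs text."""
    table = {}
    for line in text.splitlines()[1:]:  # the first line is the CPU0 CPU1 ... header
        name, sep, counts = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "NAME: n n n" row
        table[name.strip()] = [int(c) for c in counts.split()]
    return table


def share_on_first_cpus(counts, n):
    """Fraction of all events that were handled by CPUs 0..n-1."""
    total = sum(counts)
    return sum(counts[:n]) / total if total else 0.0


def report(path="/proc/softirqs", first_n=4):
    """Print how much of the network softirq load landed on the first CPUs."""
    with open(path) as f:
        table = parse_softirqs(f.read())
    for name in ("NET_RX", "NET_TX"):
        counts = table.get(name, [])
        share = share_on_first_cpus(counts, first_n)
        print("{}: {:.1%} on cpu0-cpu{} ({} CPUs listed)".format(
            name, share, first_n - 1, len(counts)))
```

Calling report() on the affected server prints the share for NET_RX and NET_TX. The counters are totals since boot, so comparing two readings taken a few seconds apart under load gives a better picture than a single one.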

Thank you
Julien
 
Old 12-27-2012, 03:15 AM   #7
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,292

Rep: Reputation: 2322
A problem with SLUB on a multi-socket SMP box might well have gone to the 'kernel' mailing list, but whatever.

@akiuni: I see this as a potential kernel bug. As such, the developers will want to make sure it hasn't already been fixed. I compiled 3.7.1 recently, and that was where I noticed comments in the help about large SMP boxes. So I would suggest 3.7.x, and if that doesn't fix it, file a kernel bug and attach your .config.

EDIT: kernel options like MAXSMP, SCHED_MC, CROSS_MEMORY_ATTACH and many others become significant. One problem I see is that as the number of CPU cores rises, the hardware falls behind (e.g. number of memory pages), and ownership restrictions could get very complicated. Also enable SLUB_DEBUG. Never mind what people say about the kernel. If you are having bad behaviour, file a bug.
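
To see quickly which of these options a given kernel was built with, its config file can be scanned. A small Python 3 sketch, assuming the standard .config syntax ("CONFIG_FOO=y" for set options, "# CONFIG_FOO is not set" for disabled ones). The function names and the option list are only illustrative.

```python
import re

# Options named in this thread; adjust the list as needed.
OPTIONS_OF_INTEREST = ["SMP", "MAXSMP", "SCHED_MC", "CROSS_MEMORY_ATTACH",
                       "SLUB", "SLUB_DEBUG"]


def parse_kernel_config(text):
    """Map option name (without the CONFIG_ prefix) to its value; disabled options map to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.match(r"# CONFIG_(\w+) is not set$", line)
        if unset:
            options[unset.group(1)] = None
            continue
        assigned = re.match(r"CONFIG_(\w+)=(.*)$", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def summarize(options, names=OPTIONS_OF_INTEREST):
    """Return one 'CONFIG_X: state' line per requested option."""
    lines = []
    for name in names:
        if name not in options:
            state = "absent"      # option unknown to this kernel version or arch
        elif options[name] is None:
            state = "not set"
        else:
            state = options[name]
        lines.append("CONFIG_{}: {}".format(name, state))
    return lines
```

For a self-built kernel the input is the .config in the build tree; on Debian packaged kernels the config of the running kernel is normally available as /boot/config-$(uname -r). Pass the file contents to parse_kernel_config() and print the lines returned by summarize().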

Last edited by business_kid; 12-27-2012 at 03:34 AM.
 
  

