Migrate an old kernel module (working on a single core) to dual core - repost
--------------------------------------------------------------------------------
Hi,
I have a kernel module that works fine on an old single-core Intel PC running Linux kernel 2.4.20-8 (RedHat9). Now I have a new PC which appears to have two processors (I guess).
=======
# cat cpuinfo
processor : 0, vendor_id : GenuineIntel, model name: Intel(R) Pentium(R) 4 CPU 3.00GHz
processor : 1, vendor_id: GenuineIntel, model name: Intel(R) Pentium(R) 4 CPU 3.00GHz
========
I compiled my kernel module and inserted it using insmod.
I am getting a kernel panic, which did not happen when I ran the code on the old PC.
My kernel module does the following:
==================
Starts with module_init ()
As part of initialization, it does "dev_add_pack (&param);".
Whenever a packet is received, it results in calling "my_call_back_1".
my_call_back_1 constructs a ring buffer and calls "schedule_task (&q);".
When the scheduler runs my_call_back_2, that function takes care of processing the packet.
==================
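To make the flow above concrete, here is a minimal user-space sketch of it. It is not the module's code: pthreads stand in for the two CPUs, a pthread mutex stands in for whatever kernel lock would protect the ring buffer, and every name in it (ring_put, ring_get, run_demo, RING_SIZE) is hypothetical. It only illustrates the assumption that the producer side (my_call_back_1) and the consumer side (my_call_back_2) can run at the same time on a two-processor machine, so the shared ring indices need a lock.

```c
/* User-space analogy only; all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define RING_SIZE 64                 /* assumed capacity */

struct ring {
    int slots[RING_SIZE];
    size_t head;                     /* next slot to write */
    size_t tail;                     /* next slot to read */
    size_t count;                    /* slots currently in use */
    pthread_mutex_t lock;            /* stands in for a kernel lock */
};

static void ring_init(struct ring *r)
{
    r->head = r->tail = r->count = 0;
    pthread_mutex_init(&r->lock, NULL);
}

/* Stand-in for my_call_back_1: enqueue one packet id.
 * Returns 0 on success, -1 if the ring is full. */
static int ring_put(struct ring *r, int pkt)
{
    int rc = -1;
    pthread_mutex_lock(&r->lock);
    if (r->count < RING_SIZE) {
        r->slots[r->head] = pkt;
        r->head = (r->head + 1) % RING_SIZE;
        r->count++;
        rc = 0;
    }
    pthread_mutex_unlock(&r->lock);
    return rc;
}

/* Stand-in for my_call_back_2: dequeue one packet id.
 * Returns 0 on success, -1 if the ring is empty. */
static int ring_get(struct ring *r, int *pkt)
{
    int rc = -1;
    pthread_mutex_lock(&r->lock);
    if (r->count > 0) {
        *pkt = r->slots[r->tail];
        r->tail = (r->tail + 1) % RING_SIZE;
        r->count--;
        rc = 0;
    }
    pthread_mutex_unlock(&r->lock);
    return rc;
}

struct job { struct ring *r; int n; long sum; };

static void *producer(void *arg)
{
    struct job *j = arg;
    for (int i = 1; i <= j->n; i++)
        while (ring_put(j->r, i) != 0)
            sched_yield();           /* ring full: let the consumer run */
    return NULL;
}

static void *consumer(void *arg)
{
    struct job *j = arg;
    int got = 0, pkt;
    while (got < j->n) {
        if (ring_get(j->r, &pkt) == 0) {
            j->sum += pkt;
            got++;
        } else {
            sched_yield();           /* ring empty: let the producer run */
        }
    }
    return NULL;
}

/* Runs one producer and one consumer concurrently and returns the
 * sum of the packet ids 1..n that the consumer received. */
static long run_demo(int n)
{
    struct ring r;
    struct job p = { &r, n, 0 }, c = { &r, n, 0 };
    pthread_t tp, tc;

    ring_init(&r);
    pthread_create(&tp, NULL, producer, &p);
    pthread_create(&tc, NULL, consumer, &c);
    pthread_join(tp, NULL);
    pthread_join(tc, NULL);
    assert(r.count == 0);            /* everything queued was consumed */
    pthread_mutex_destroy(&r.lock);
    return c.sum;
}
```

Calling run_demo(1000) from a small main() and building with -pthread should return the sum of 1..1000 if no packet is lost or duplicated. Without the lock, head, tail and count can be corrupted once two threads touch them at the same time, which a single-core machine can hide.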
When I send certain traffic to "eth0", the kernel panic happens.
The backtrace is as follows; note the MULTIPLE context_thread entries.
Jul 24 10:37:35 localhost kernel: [<c01342a9>] context_thread [kernel] 0x149 (0xf7fc7f9c))
Jul 24 10:37:35 localhost kernel: [<c0134160>] context_thread [kernel] 0x0 (0xf7fc7fc4))
Jul 24 10:37:35 localhost kernel: [<c0134160>] context_thread [kernel] 0x0 (0xf7fc7fe0))
Jul 24 10:37:35 localhost kernel: [<c010759d>] kernel_thread_helper [kernel] 0x5 (0xf7fc7ff0))
NOTE: Sometimes I see "alloc_skb" in the backtrace and sometimes "__kfree_skb", so I am not sure whether both processors are trying to access the same code.
I am not sure whether the issue is due to DUAL processor or dual core.
Is any special care required w.r.t. semaphores?
Are the interrupts from the device being processed by both processors simultaneously, resulting in a race condition?
Can anyone help me to resolve this issue?
regards,
Senthil.