LinuxQuestions.org
Old 04-12-2009, 04:53 AM   #1
nkataria
LQ Newbie
 
Registered: Jun 2008
Posts: 20

Rep: Reputation: 0
Kernel Call Trace Order - Is it top to bottom OR vice-versa


Hello All,


I am facing a strange problem with a program I wrote: it goes into a zombie state. When I run "echo t > /proc/sysrq-trigger", I get the following in the "/var/log/messages" file.

----------some-code-----------
Apr 12 14:51:12 localhost kernel: Call Trace:
Apr 12 14:51:12 localhost kernel: [<021209d0>] do_exit+0x386/0x390
Apr 12 14:51:12 localhost kernel: [<02106693>] do_divide_error+0x0/0xaa
Apr 12 14:51:12 localhost kernel: [<02118df5>] do_page_fault+0x2fd/0x4b4
Apr 12 14:51:12 localhost kernel: [<0214e2d3>] sys_close+0x0/0x61
Apr 12 14:51:12 localhost kernel: [<0214e2d3>] sys_close+0x0/0x61
Apr 12 14:51:12 localhost kernel: [<02142e0e>] __vma_link+0x4e/0x93
Apr 12 14:51:12 localhost kernel: [<02142eaf>] vma_link+0x5c/0x8d
Apr 12 14:51:12 localhost kernel: [<02140c53>] follow_page+0x128/0x134
Apr 12 14:51:12 localhost kernel: [<0214c50b>] rw_vm+0x20b/0x234
Apr 12 14:51:12 localhost kernel: [<02118af8>] do_page_fault+0x0/0x4b4
Apr 12 14:51:12 localhost kernel: [<0214e2d3>] sys_close+0x0/0x61

----------some-code-----------

Can someone explain the order of the function calls here?
Did "do_exit" call "do_divide_error" or vice versa?

Does the presence of "do_divide_error" mean that my program performs a divide-by-zero operation somewhere?

Thanks and Regards,
Navneet Kataria
P.S. I am using Fedora Core 2 with the 2.6.5-1.358smp kernel.
 
Old 04-13-2009, 08:28 AM   #2
titan22
LQ Newbie
 
Registered: Apr 2009
Posts: 17

Rep: Reputation: 3
It's bottom (caller) to top (callee). At some point in time do_divide_error() called do_exit().
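
To make the ordering concrete, here is a minimal Python sketch that pulls the entries out of the first lines of the log above and prints them caller first, that is, in the reverse of the order the kernel printed them. The helper name parse_trace and the regular expression are my own assumptions based on the log format quoted in this thread; this is not a kernel tool.

```python
import re

# Matches entries such as "[<021209d0>] do_exit+0x386/0x390".
# The pattern is an assumption based on the log format quoted in this thread.
ENTRY_RE = re.compile(
    r"\[<([0-9a-fA-F]+)>\]\s+(\w+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)"
)


def parse_trace(log_text):
    """Return (address, symbol, offset, size) tuples in printed order (callee first)."""
    entries = []
    for line in log_text.splitlines():
        m = ENTRY_RE.search(line)
        if m:
            addr, sym, off, size = m.groups()
            entries.append((int(addr, 16), sym, int(off, 16), int(size, 16)))
    return entries


LOG = """\
Apr 12 14:51:12 localhost kernel: Call Trace:
Apr 12 14:51:12 localhost kernel: [<021209d0>] do_exit+0x386/0x390
Apr 12 14:51:12 localhost kernel: [<02106693>] do_divide_error+0x0/0xaa
Apr 12 14:51:12 localhost kernel: [<02118df5>] do_page_fault+0x2fd/0x4b4
"""

printed_order = parse_trace(LOG)            # top of the log first (callee side)
caller_first = list(reversed(printed_order))  # bottom of the log first (caller side)

# Indent each level so the bottom-to-top reading shows up as nesting.
for depth, (addr, sym, off, size) in enumerate(caller_first):
    print("  " * depth + f"{sym}+{off:#x}/{size:#x}")
```

The numbers after each symbol are the offset into the function and the function's total size, both in hex.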
 
Old 04-14-2009, 11:57 PM   #3
nkataria
LQ Newbie
 
Registered: Jun 2008
Posts: 20

Original Poster
Rep: Reputation: 0
Thanks for the reply, titan22.

Can you also help me find the cause of the problem? For now I think it is caused by a hardware issue, because the same application runs fine on another set of hardware with the same OS installed. What kind of hardware issue could it be?

--
Thanks and Regards,
Navneet Kataria
 
Old 04-17-2009, 10:06 AM   #4
titan22
LQ Newbie
 
Registered: Apr 2009
Posts: 17

Rep: Reputation: 3
Your hardware issue is likely in the driver that implements "close". The driver may need a different implementation for different hardware. Sometimes /var/log/messages also shows the EIP (instruction pointer) of the instruction that caused the problem. You can decode it with gdb.

For example:
kernel: EIP is at d_instantiate+0x2d/0x56
[11:05am] /usr/src/linux-2.6.6-1.435.2.3.lair6smp
44 > gdb vmlinux
(gdb) info line *d_instantiate+0x2d
Line 66 of "list.h" starts at address 0xc0167d87
and ends at 0xc0167d8a <d_instantiate+48>.

Only limited output is available, and the stack dump does not give a very clear picture: it shows two close() calls in one stack, which is impossible. The best guess is that it crashed during do_page_fault().

Apr 12 14:51:12 localhost kernel: [<021209d0>] do_exit+0x386/0x390
Apr 12 14:51:12 localhost kernel: [<02106693>] do_divide_error+0x0/0xaa
Apr 12 14:51:12 localhost kernel: [<02118df5>] do_page_fault+0x2fd/0x4b4
Apr 12 14:51:12 localhost kernel: [<0214e2d3>] sys_close+0x0/0x61
Apr 12 14:51:12 localhost kernel: [<0214e2d3>] sys_close+0x0/0x61
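
As a rough way to spot the unreliable entries, here is a small Python sketch that flags lines which repeat or which sit at offset 0x0. It rests on two assumptions of mine that are not confirmed anywhere in this thread: that the trace printer on kernels of this era lists anything on the stack that looks like a kernel text address (so some entries may be stale values rather than live calls), and that a genuine return address normally points past a call instruction and therefore has a non-zero offset. The function name suspicious_entries is hypothetical.

```python
import re
from collections import Counter

# Pattern assumed from the log format quoted above.
ENTRY_RE = re.compile(
    r"\[<([0-9a-fA-F]+)>\]\s+(\w+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)"
)


def suspicious_entries(log_text):
    """Return (symbol, reason) pairs for trace entries that deserve a second look."""
    entries = [(m.group(2), int(m.group(3), 16)) for m in ENTRY_RE.finditer(log_text)]
    counts = Counter(entries)
    flagged = []
    seen = set()
    for sym, off in entries:
        if (sym, off) in seen:
            continue  # report each distinct entry only once
        seen.add((sym, off))
        reasons = []
        if off == 0:
            # Heuristic (assumption): offset 0 is the function's first byte,
            # which is not where a call instruction would return to.
            reasons.append("offset 0x0")
        if counts[(sym, off)] > 1:
            reasons.append(f"appears {counts[(sym, off)]} times")
        if reasons:
            flagged.append((sym, "; ".join(reasons)))
    return flagged


LOG = """\
Apr 12 14:51:12 localhost kernel: [<021209d0>] do_exit+0x386/0x390
Apr 12 14:51:12 localhost kernel: [<02106693>] do_divide_error+0x0/0xaa
Apr 12 14:51:12 localhost kernel: [<02118df5>] do_page_fault+0x2fd/0x4b4
Apr 12 14:51:12 localhost kernel: [<0214e2d3>] sys_close+0x0/0x61
Apr 12 14:51:12 localhost kernel: [<0214e2d3>] sys_close+0x0/0x61
"""

flagged = suspicious_entries(LOG)
for sym, reason in flagged:
    print(f"{sym}: {reason}")
```

If those assumptions hold for this kernel, the flagged lines are the ones to trust least, and the entries with non-zero offsets are the ones worth translating to source lines first.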

Translate "do_page_fault+0x2fd/0x4b4" and "sys_close+0x0/0x61" to the corresponding source line numbers, then check whether a divide by zero could occur near those lines. do_page_fault() itself does not perform any "divide" operation, so the divide is likely done in a function called by do_page_fault(), for example handle_mm_fault() (just a guess). close() can be implemented by either a filesystem or a network driver.
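
The translation step can be scripted. The following Python sketch turns a "symbol+0xoffset/0xsize" entry into the same kind of "info line" command used in the gdb example above. The function name gdb_info_line is hypothetical, and the sketch only builds the command text; it does not run gdb.

```python
import re

# "symbol+0xoffset/0xsize", as printed in the call trace.
SYMBOL_RE = re.compile(r"^(\w+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)$")


def gdb_info_line(entry):
    """Build a gdb 'info line' command for one call-trace entry."""
    m = SYMBOL_RE.match(entry.strip())
    if m is None:
        raise ValueError(f"not a symbol+offset/size entry: {entry!r}")
    sym = m.group(1)
    off = int(m.group(2), 16)
    size = int(m.group(3), 16)
    if off > size:
        # The offset should fall inside the function it is reported against.
        raise ValueError(f"offset {off:#x} is beyond function size {size:#x}")
    # The size part is dropped: gdb only needs symbol + offset.
    return f"info line *{sym}+{off:#x}"


commands = [
    gdb_info_line("do_page_fault+0x2fd/0x4b4"),
    gdb_info_line("sys_close+0x0/0x61"),
]
for cmd in commands:
    print(cmd)
```

The printed commands can be pasted into a gdb session started on a vmlinux that was built with debug information and matches the running kernel.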

You can also add assert()-style checks (in kernel code, for example BUG_ON()) or panic() around the possible candidates to narrow down the problem. That should tell you the exact line number where the unexpected happens.

Last edited by titan22; 04-17-2009 at 09:16 PM.
 
Old 04-27-2009, 04:30 AM   #5
nkataria
LQ Newbie
 
Registered: Jun 2008
Posts: 20

Original Poster
Rep: Reputation: 0
Thanks once again Titan22 !!

I tried my application on Red Hat ES. The issue seems to be resolved: the application and the OS have been running fine for 10 days.

BTW, I could not debug the FC2 kernel in my case, because I could not locate the "vmlinux" file for it. I guess I need to compile the kernel [from the provided source] to generate the vmlinux file.

So it looks like a hardware compatibility issue with FC2, which was resolved by moving to RH-ES4.

One more thing I want to ask: why do different distributions (like Fedora and Debian) modify the standard kernel? And why don't they clearly specify the changes they have made?
 
  

