LinuxQuestions.org
Old 05-13-2009, 02:56 PM   #1
bcg121
LQ Newbie
 
Registered: Oct 2007
Location: Pennsylvania
Posts: 19

Rep: Reputation: 0
Using gdb to analyze core dump caused by strlen() SIGSEGV


My C program is very occasionally causing a core dump. When I run gdb to analyze the core dump (gdb <program name> -c core) and do a backtrace, I get the following:
(gdb) bt
#0 0x40d478 in strlen ()
#1 0x174aed8 in ?? ()
Why doesn't the backtrace tell me who was calling strlen()? Are there any other commands I can use to determine this? This problem is very difficult to reproduce, so I'd really like to extract the necessary info from this core file.

In a previous experiment I put code in the application that intentionally causes it to segfault, and in those cases the backtrace was much more helpful.

Thanks in advance for your advice. And please let me know if you need more information.
 
Old 05-13-2009, 08:42 PM   #2
WildPossum
Member
 
Registered: Feb 2004
Location: Sydney - Australia
Distribution: Ubuntu, OpenSUSE, Mythbuntu, Embedded Linux
Posts: 46

Rep: Reputation: 18

Usually when a string function in the C library (glibc, newlib, etc.) crashes, it is because of one of the following:
A> the string is longer than the buffer holding it, so the function runs past the end of the buffer.
B> the source string whose length you are trying to get is NOT zero-terminated.
C> the source is a pointer that is NULL or invalid, or it does not reach a terminating zero byte within valid bounds.

You can verify your lib actions by looking into the source code which is freely available.
Do a google for its home source location.

The best programming alternative is to always use strncpy/strncmp etc., where n dictates the maximum number of characters copied or compared, and then force-append a terminating zero byte. It is up to you to do this; it is expected in good programming. I usually get such issues whenever the program cuts and pastes a lot of substrings before I get the final string required. This is usually the case when building an FQDN or a fully qualified path and file name.
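A minimal sketch of that copy-then-terminate pattern, in case it helps. The helper name copy_bounded and its return convention are made up for illustration and do not come from the original program; it assumes dst really has room for cap bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name and return convention are assumptions):
 * copy src into dst, never writing more than cap bytes, and always
 * leave dst zero-terminated.
 * Returns 1 if src fit completely, 0 if it had to be truncated. */
int copy_bounded(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL && cap > 0);

    /* strncpy pads dst with zero bytes when src is shorter than cap,
     * but writes NO terminator when src has cap or more characters. */
    strncpy(dst, src, cap);

    if (dst[cap - 1] != '\0') {
        dst[cap - 1] = '\0';   /* forced termination; src was too long */
        return 0;
    }
    return 1;
}
```

Checking the return value each time a substring is appended shows where a path or FQDN first gets truncated, instead of finding out later inside strlen().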

Hope this assists.
 
Old 05-13-2009, 09:33 PM   #3
cetialphav
Member
 
Registered: Sep 2003
Location: Raleigh, NC, USA
Distribution: Fedora
Posts: 88

Rep: Reputation: 16
It is quite possible that the contents of the stack are being overwritten due to a buffer overflow. When the segmentation fault happens later, a core file is written with the contents of this corrupted stack.

It is also possible that there are no symbols available for that address. Was everything compiled with the -g option?

One way I attack these kinds of problems is to set the MALLOC_CHECK_ environment variable to 3 when running the program. This causes sanity checks to be done by the C runtime and aborts the program (with a core dump) as soon as something fishy is detected. This often prevents the stack from being messed up by crashing earlier.

Valgrind is sometimes helpful too. It can detect various memory management errors, so fixing what it reports may solve your problem.
 
Old 05-14-2009, 07:50 AM   #4
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by bcg121
(gdb) bt
#0 0x40d478 in strlen ()
#1 0x174aed8 in ?? ()
Sometimes gdb simply fails to understand the stack frame at the point where the crash occurs. So it is possible that 0x174aed8 is not the correct return address.

If you know a little bit about x86 assembler, you could display a little of the stack and display the disassembly of strlen and understand the stack frame that gdb didn't understand so you could find the correct return address.

0x174aed8 is a rather large value for a return address. That would not be an address in your main executable. Maybe it could be in a .so file (that has no symbols). Maybe it is in executable data. But more likely it is just wrong.

strlen did not corrupt the stack. Earlier stack corruption would not destroy the return address to strlen. So stack corruption is a possibility here, but if this is stack corruption then even the identification of strlen is incorrect. More likely, strlen is correct, the value passed to strlen was bad and caused the fault, and gdb is misunderstanding the stack frame.

Or is this process multithreaded? If it is multithreaded, then some other thread might have trashed this thread's stack during the execution of strlen.

Last edited by johnsfine; 05-14-2009 at 07:51 AM.
 
Old 05-15-2009, 05:28 AM   #5
bcg121
LQ Newbie
 
Registered: Oct 2007
Location: Pennsylvania
Posts: 19

Original Poster
Rep: Reputation: 0
Thanks for the responses!

My application is multithreaded, but I do not think any of the other threads would have trashed this thread's stack. This is because I am able to use gdb to get a backtrace of the other four threads, and they are all suspended (sem_timedwait, select, select, and blocking read, respectively).

I am only using static libraries (.a), no dynamic libraries (.so).

Unfortunately I am not knowledgeable enough to look at the memory myself and figure out who the caller is...

So, either there is stack corruption or gdb cannot interpret the stack. Obviously I am hoping for the latter, otherwise I have absolutely nothing to go on. My current plan is to review all of my string handling. Specifically, I am going to explicitly force null termination before all calls to strlen.
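One possible way to do that check without overwriting anything is a bounded length function, sketched below. The name bounded_len is hypothetical, and the sketch assumes the caller knows the capacity of the buffer being measured; it uses only the standard memchr().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original program): return the
 * length of the string stored in buf, looking at no more than cap
 * bytes. If no zero byte is found in the first cap bytes, return cap
 * so the caller can tell the buffer is unterminated. */
size_t bounded_len(const char *buf, size_t cap)
{
    assert(buf != NULL);

    /* memchr stops at the first zero byte or after cap bytes,
     * whichever comes first, so it never reads past the buffer. */
    const char *end = (const char *)memchr(buf, '\0', cap);

    return end ? (size_t)(end - buf) : cap;
}
```

A result equal to cap means the buffer had no terminator, which is worth logging together with the call site, since that is exactly the condition that would have sent a plain strlen() off the end of the buffer.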
 
  

