LinuxQuestions.org
03-01-2012, 06:31 AM   #1
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

tar and catppt segfaulting. How to analyse the reason?


/var/log/messages includes several lines like these each day:
Code:
Feb 29 23:21:47 LS1 kernel: [400016.298450] tar[2486]: segfault at 0 ip 000000000042d050 sp 00007fffb1e0c0a8 error 4 in tar[400000+55000]
Mar  1 00:28:36 LS1 kernel: [404025.303387] catppt[5544]: segfault at 22309d1 ip 0000000000403733 sp 00007fff53c16478 error 4 in catppt[400000+6000]
They occur regularly when amanda runs tar and when omindex runs catppt, both as scheduled jobs.
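
The kernel lines themselves already say a fair amount. Here is a minimal sketch in plain POSIX shell that pulls the fields out of one such line, decodes the low three bits of the x86 page-fault error code, and works out how far into the binary the faulting instruction sits. The function name `decode_segfault` is made up for illustration, and the parsing assumes the x86 message format shown above.

```shell
#!/bin/sh
# decode_segfault: hypothetical helper, not part of any package.
# Reads one kernel "segfault at ..." line (x86 format) on stdin.
decode_segfault() {
    line=$(cat)
    # Pull out the hex fields; each pattern mirrors the log format above.
    addr=$(printf '%s\n' "$line" | sed -n 's/.*segfault at \([0-9a-f]*\) ip.*/\1/p')
    ip=$(printf '%s\n' "$line" | sed -n 's/.* ip \([0-9a-f]*\) sp.*/\1/p')
    err=$(printf '%s\n' "$line" | sed -n 's/.* error \([0-9a-f]*\) in.*/\1/p')
    base=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*].*/\1/p')
    if [ -z "$addr" ] || [ -z "$ip" ] || [ -z "$err" ] || [ -z "$base" ]; then
        echo "unrecognised line" >&2
        return 1
    fi

    # The kernel prints the error code in hex. Bit 0: protection violation
    # (else page not present); bit 1: write (else read); bit 2: user mode.
    e=$(( 0x$err ))
    if [ $(( e & 1 )) -eq 0 ]; then cause="page not present"; else cause="protection violation"; fi
    if [ $(( e & 2 )) -eq 0 ]; then access="read"; else access="write"; fi
    if [ $(( e & 4 )) -eq 0 ]; then mode="kernel"; else mode="user"; fi

    # Offset of the faulting instruction from the start of the mapping.
    offset=$(( 0x$ip - 0x$base ))

    printf 'fault address: 0x%s\n' "$addr"
    printf 'access: %s-mode %s, %s\n' "$mode" "$access" "$cause"
    printf 'offset in binary: 0x%x\n' "$offset"
}
```

It could be fed from the log with something like `grep 'segfault at' /var/log/messages | while IFS= read -r l; do printf '%s\n' "$l" | decode_segfault; done`. For the two lines above, "error 4" decodes to a user-mode read of a page that is not present. The tar fault address of 0 looks like a NULL pointer dereference, and the offsets work out to 0x2d050 (tar) and 0x3733 (catppt). The same offset turning up every time would point to one specific code path rather than random corruption.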

How can I find the cause?

Why would whatever is wrong cause only these two executables to segfault?

Is it reasonable to hypothesise that defective memory is an unlikely cause? The segfaults occur regularly and only in these two executables; the two are unlikely to be loaded at the same place in memory, and bad memory would likely cause other problems as well.

Both programs were installed from the distro repositories (Debian Squeeze 64-bit).

I reproduced the catppt fault at the command line (did not try with tar).

Here is the end of the strace output:
Code:
lseek(3, 0, SEEK_SET)                   = 0
read(3, "\320\317\21\340\241\261\32\341\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0>\0\3\0\376\377\t\0"..., 4096) = 4096
lseek(3, 8671232, SEEK_SET)             = 8671232
read(3, "\201B\0\0\202B\0\0\203B\0\0\204B\0\0\205B\0\0\206B\0\0\207B\0\0\210B\0\0"..., 4096) = 4096
lseek(3, 3538944, SEEK_SET)             = 3538944
read(3, "QZ|\36w\371m\367\367&R\346\266\226\260QE\0252\217-\265\275\312\366\236_\217\374\0\242\212"..., 4096) = 4096
lseek(3, 4272128, SEEK_SET)             = 4272128
read(3, "\313t \211\272\262\22\36\354Vx7\\\232\277`\264\220*X3\242h\241O\255\346\253D\347D+"..., 4096) = 4096
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Segmentation fault
Here is the entire gdb session:
Code:
c@LS1:~$ gdb catppt
GNU gdb (GDB) 7.0.1-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/catppt...(no debugging symbols found)...done.
(gdb)  run -dutf-8 <pathname not shown>.ppt
Starting program: /usr/bin/catppt -dutf-8 <pathname not shown>.ppt

Program received signal SIGSEGV, Segmentation fault.
0x0000000000403733 in ?? ()
(gdb) quit
A debugging session is active.

	Inferior 1 [process 25443] will be killed.

Quit anyway? (y or n) y
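
The session above ends before a backtrace is requested. Even without debugging symbols, gdb can still show the raw stack addresses, the register contents and the faulting instruction. A minimal sketch of a non-interactive run follows; the wrapper name `crash_report` is made up, while `-batch`, `-ex` and `--args` are regular gdb options.

```shell
#!/bin/sh
# crash_report: hypothetical wrapper name.
# Usage: crash_report PROGRAM [ARGS...]
# Runs the program under gdb and, once it stops on the signal, prints a
# backtrace, the registers and the instruction at the program counter.
crash_report() {
    gdb -batch \
        -ex run \
        -ex bt \
        -ex 'info registers' \
        -ex 'x/i $pc' \
        --args "$@"
}
```

It would be called with the same arguments as the failing command, for example `crash_report /usr/bin/catppt -dutf-8` followed by the .ppt pathname. Without symbols the backtrace frames show as `??`, but the instruction and registers can still hint at which pointer was bad.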
I wondered whether both use an unusual library that has been corrupted:
Code:
c@LS1:~$ ldd /bin/tar
	linux-vdso.so.1 =>  (0x00007ffff9195000)
	librt.so.1 => /lib/librt.so.1 (0x00007f78ee7a9000)
	libc.so.6 => /lib/libc.so.6 (0x00007f78ee447000)
	libpthread.so.0 => /lib/libpthread.so.0 (0x00007f78ee22a000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f78ee9bb000)
c@LS1:~$ ldd /usr/bin/catppt
	linux-vdso.so.1 =>  (0x00007fffa93a9000)
	libm.so.6 => /lib/libm.so.6 (0x00007fc9bd85a000)
	libc.so.6 => /lib/libc.so.6 (0x00007fc9bd4f8000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fc9bdae6000)
But none of the libraries used by both is unusual; they are the three most commonly used libraries on the system:
Code:
find /bin /usr/bin -type f -perm /a+x -exec ldd {} \; | grep so | sed -e '/^[^\t]/ d' | sed -e 's/\t//' | sed -e 's/ =.*//' | sed -e 's/ (0.*)//' | sort | uniq -c | sort -n > /tmp/trash
c@LS1:~$ for lib in linux-vdso.so.1 libc.so.6 /lib64/ld-linux-x86-64.so.2; do grep "$lib" /tmp/trash; done
    573 linux-vdso.so.1
    573 libc.so.6
    573 /lib64/ld-linux-x86-64.so.2
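
The libraries common to both binaries can also be listed directly instead of being picked out by eye. A minimal sketch in POSIX shell follows; the helper names `lib_names` and `common_libs` are made up, and it assumes ldd output in the format shown above.

```shell
#!/bin/sh
# lib_names: hypothetical helper. Reads ldd output on stdin and prints
# one library name per line, sorted and de-duplicated.
lib_names() {
    # ldd lines look like "<tab>name => path (addr)" or "<tab>/path (addr)";
    # the first whitespace-separated field is the name in both cases.
    awk '{ print $1 }' | sort -u
}

# common_libs: hypothetical helper. Prints the libraries that both
# binaries are linked against.
# Usage: common_libs BINARY1 BINARY2
common_libs() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    ldd "$1" | lib_names > "$a"
    ldd "$2" | lib_names > "$b"
    # comm -12 keeps only the lines present in both sorted files.
    comm -12 "$a" "$b"
    rm -f "$a" "$b"
}
```

With the ldd output above, `common_libs /bin/tar /usr/bin/catppt` should list linux-vdso.so.1, libc.so.6 and /lib64/ld-linux-x86-64.so.2, the same three that were counted.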
What to do?
 
03-01-2012, 11:41 AM   #2
smallpond
Senior Member
 
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: Fedora
Posts: 4,138

My guess is that they are hitting a bad disk file. Try fsck.
 
03-02-2012, 02:39 AM   #3
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578

Original Poster
Blog Entries: 31

Update

Comparing md5sums of the binaries and the (real) libraries they use with those on the development server shows they are identical, so file system problems, including binary and library corruption, can be ruled out.
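
For anyone wanting to repeat that kind of comparison, a minimal sketch follows. The helper name `real_md5` is made up; it resolves each path through any symlinks with `readlink -f` so that the checksum is taken of the real file rather than reported under the link name.

```shell
#!/bin/sh
# real_md5: hypothetical helper. For each path given, follows symlinks to
# the real file and prints its md5sum together with the resolved path.
real_md5() {
    for f in "$@"; do
        md5sum "$(readlink -f "$f")"
    done
}
```

Running, say, `real_md5 /usr/bin/catppt /lib/libm.so.6 /lib/libc.so.6 /lib64/ld-linux-x86-64.so.2` on each server and saving the output to a file gives two lists that can be compared with diff. The paths here are the ones from the ldd output earlier in the thread.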

Running the failing catppt command on the development server also segfaults, so hardware defects can be ruled out, leaving a catppt bug as the most likely explanation. Bug reported here.

Next step: reproduce the tar segfault outside amanda. Incidentally, there are a few current tar segfault reports:
 
03-07-2012, 10:23 PM   #4
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578

Original Poster
Blog Entries: 31

Update

The tar segfault was worked around by re-configuring what amanda backs up. A changed directory layout had resulted in some of the /var/lib/amanda/gnutar-lists/* configuration files not matching anything on the file system.

Surprisingly, considering how old tar is and how heavily it is used, it is a bug in tar that caused the segfault. It seems to be associated with changing directory to / (or the chrooted equivalent?). Unfortunately I am not the amanda administrator so I do not know enough to submit a useful bug report.
 
  


All times are GMT -5.