/var/log/messages includes several lines like these each day:
Code:
Feb 29 23:21:47 LS1 kernel: [400016.298450] tar[2486]: segfault at 0 ip 000000000042d050 sp 00007fffb1e0c0a8 error 4 in tar[400000+55000]
Mar 1 00:28:36 LS1 kernel: [404025.303387] catppt[5544]: segfault at 22309d1 ip 0000000000403733 sp 00007fff53c16478 error 4 in catppt[400000+6000]
They occur regularly when amanda runs tar and when omindex runs catppt, both as scheduled jobs.
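For reference, the fields in those kernel lines can be decoded mechanically. The following is a minimal sketch, not an authoritative tool: `decode_segfault` is a hypothetical helper name, and it assumes the x86 page-fault error-code layout (bit 0 = protection fault rather than page not present, bit 1 = write, bit 2 = user mode) and that the kernel prints every field in hexadecimal.

```shell
#!/bin/sh
# Hypothetical helper: decode a kernel "segfault at ..." log line (x86 / x86-64).
decode_segfault() {
    line=$1
    # Pull the hexadecimal fields out of the message with sed.
    addr=$(printf '%s\n' "$line" | sed -n 's/.*segfault at \([0-9a-f]*\) ip .*/\1/p')
    ip=$(printf '%s\n' "$line" | sed -n 's/.* ip \([0-9a-f]*\) sp .*/\1/p')
    err=$(printf '%s\n' "$line" | sed -n 's/.* error \([0-9a-f]*\) in .*/\1/p')
    base=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\]$/\1/p')

    # Assumed error-code bits: 1 = protection fault (else page not present),
    # 2 = write (else read), 4 = user mode (else kernel mode).
    if [ $(( 0x$err & 1 )) -ne 0 ]; then cause=protection; else cause=not-present; fi
    if [ $(( 0x$err & 2 )) -ne 0 ]; then access=write; else access=read; fi
    if [ $(( 0x$err & 4 )) -ne 0 ]; then mode=user; else mode=kernel; fi

    # offset = instruction pointer minus the load address of the mapped image.
    printf 'addr=0x%x offset=0x%x access=%s cause=%s mode=%s\n' \
        "$(( 0x$addr ))" "$(( 0x$ip - 0x$base ))" "$access" "$cause" "$mode"
}

decode_segfault 'tar[2486]: segfault at 0 ip 000000000042d050 sp 00007fffb1e0c0a8 error 4 in tar[400000+55000]'
decode_segfault 'catppt[5544]: segfault at 22309d1 ip 0000000000403733 sp 00007fff53c16478 error 4 in catppt[400000+6000]'
```

Under those assumptions, both lines describe a user-mode read of an unmapped address: address 0 in tar (which looks like a NULL pointer dereference) and address 0x22309d1 in catppt, each at a fixed offset inside the program's own code rather than inside a shared library.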
How can I find the cause?
Why would whatever is wrong cause only these two executables to segfault?
Is it reasonable to hypothesise that defective memory is an unlikely cause? The segfaults occur regularly and only in these two executables; the two are unlikely to load at the same location, and bad memory would probably cause other problems as well.
Both programs were installed from the distro repositories (Debian Squeeze 64-bit).
I reproduced the catppt fault at the command line (I did not try with tar).
Here is the end of the strace output:
Code:
lseek(3, 0, SEEK_SET) = 0
read(3, "\320\317\21\340\241\261\32\341\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0>\0\3\0\376\377\t\0"..., 4096) = 4096
lseek(3, 8671232, SEEK_SET) = 8671232
read(3, "\201B\0\0\202B\0\0\203B\0\0\204B\0\0\205B\0\0\206B\0\0\207B\0\0\210B\0\0"..., 4096) = 4096
lseek(3, 3538944, SEEK_SET) = 3538944
read(3, "QZ|\36w\371m\367\367&R\346\266\226\260QE\0252\217-\265\275\312\366\236_\217\374\0\242\212"..., 4096) = 4096
lseek(3, 4272128, SEEK_SET) = 4272128
read(3, "\313t \211\272\262\22\36\354Vx7\\\232\277`\264\220*X3\242h\241O\255\346\253D\347D+"..., 4096) = 4096
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Segmentation fault
Here is the entire gdb session:
Code:
c@LS1:~$ gdb catppt
GNU gdb (GDB) 7.0.1-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/catppt...(no debugging symbols found)...done.
(gdb) run -dutf-8 <pathname not shown>.ppt
Starting program: /usr/bin/catppt -dutf-8 <pathname not shown>.ppt
Program received signal SIGSEGV, Segmentation fault.
0x0000000000403733 in ?? ()
(gdb) quit
A debugging session is active.
Inferior 1 [process 25443] will be killed.
Quit anyway? (y or n) y
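The session above stops at the fault without inspecting it. A minimal sketch of collecting more detail non-interactively follows; `segv_report` is a hypothetical wrapper name, and the `GDB` variable is only an assumption that makes the debugger command overridable. Without debugging symbols the backtrace will still show `??` frames, but the registers and the disassembly at `$pc` should indicate which pointer was being read.

```shell
#!/bin/sh
# Hypothetical wrapper: run a program under gdb in batch mode and, when it
# stops on the fault, print a backtrace, the registers and the next
# few instructions at the program counter.
segv_report() {
    "${GDB:-gdb}" -batch \
        -ex run \
        -ex bt \
        -ex 'info registers' \
        -ex 'x/5i $pc' \
        --args "$@"
}

# Usage (the .ppt path is not shown in this post, so it stays a placeholder):
#   segv_report catppt -dutf-8 <pathname not shown>.ppt > /tmp/catppt-gdb.txt 2>&1
```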
I wondered whether both use an unusual library that has been corrupted:
Code:
c@LS1:~$ ldd /bin/tar
linux-vdso.so.1 => (0x00007ffff9195000)
librt.so.1 => /lib/librt.so.1 (0x00007f78ee7a9000)
libc.so.6 => /lib/libc.so.6 (0x00007f78ee447000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00007f78ee22a000)
/lib64/ld-linux-x86-64.so.2 (0x00007f78ee9bb000)
c@LS1:~$ ldd /usr/bin/catppt
linux-vdso.so.1 => (0x00007fffa93a9000)
libm.so.6 => /lib/libm.so.6 (0x00007fc9bd85a000)
libc.so.6 => /lib/libc.so.6 (0x00007fc9bd4f8000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc9bdae6000)
But none of the libraries used by both is unusual; they are the three most commonly used libraries on the system:
Code:
find /bin /usr/bin -type f -perm /a+x -exec ldd {} \; | grep so | sed -e '/^[^\t]/ d' | sed -e 's/\t//' | sed -e 's/ =.*//' | sed -e 's/ (0.*)//' | sort | uniq -c | sort -n > /tmp/trash
c@LS1:~$ for lib in linux-vdso.so.1 libc.so.6 /lib64/ld-linux-x86-64.so.2; do grep "$lib" /tmp/trash; done
573 linux-vdso.so.1
573 libc.so.6
573 /lib64/ld-linux-x86-64.so.2
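Since the shared libraries look ordinary, the same corruption question can be asked of the two executables themselves. The sketch below assumes the owning packages ship dpkg checksum lists under /var/lib/dpkg/info/ with paths relative to `/`; `verify_md5sums` is a hypothetical helper name, and the package names should be taken from `dpkg -S` rather than guessed.

```shell
#!/bin/sh
# Hypothetical helper: check files against an md5sums list whose paths are
# relative to a root directory (dpkg's lists are assumed relative to /).
verify_md5sums() {
    list=$1          # absolute path of the checksum list
    root=${2:-/}     # directory the listed paths are relative to
    ( cd "$root" && md5sum -c "$list" )
}

# Usage on the affected system (package name as reported by dpkg -S):
#   dpkg -S /bin/tar /usr/bin/catppt
#   verify_md5sums /var/lib/dpkg/info/<package>.md5sums
```

A FAILED line for bin/tar or usr/bin/catppt would point to a damaged file on disk; all OK would suggest looking at the input data instead.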
What should I try next?