Segmentation Fault on Linux Red Hat AS release 4 (Nahant Update 4): 2.6.9-42.ELsmp
Recently, after a reboot, our server started segfaulting when running one application (Rational Tau 3.0). Below is the error log:
/home1/env/tau3.0/bin/taubatch --project ../Model/TRXHM.ttp --config Target -G
Telelogic Tau Application Builder
make: *** [generate] Segmentation fault (core dumped)

The server was rebooted after a firmware upgrade on the network storage, and no changes were made to the server settings. Interestingly, the problem occurs only with one particular revision (Tau 3.0), while the other revision works fine. The application (Tau 3.0) lives on a network share (mounted on the server) and works fine on other machines.

I'd appreciate any help in debugging this issue. Thanks in advance.

Below is the strace output for reference:
-------------------------------------
readlink("/proc/self/fd/0", "/dev/pts/1", 1023) = 10
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
rt_sigaction(SIGPIPE, {0x16fc184, [], SA_RESTORER, 0xc25898}, {SIG_IGN}, 8) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
stat64("/home/hindustan.psbuild/.flexlmrc", {st_mode=S_IFREG|0664, st_size=99, ...}) = 0
fcntl64(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(19353), sin_addr=inet_addr("10.100.211.81")}, 16) = -1 EINPROGRESS (Operation now in progress)
select(4, NULL, [3], NULL, {10, 0}) = 1 (out [3], left {10, 0})
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
send(3, "h\00413hindustan.psbuild\0\0\0\0VNL-HIN"..., 147, 0) = 147
gettimeofday({1293082231, 917489}, {4294966966, 0}) = 0
select(4, [3], NULL, NULL, {10, 0}) = 1 (in [3], left {10, 0})
gettimeofday({1293082231, 917842}, {4294966966, 0}) = 0
gettimeofday({1293082231, 917876}, {4294966966, 0}) = 0
gettimeofday({1293082231, 917916}, {4294966966, 0}) = 0
select(4, [3], NULL, NULL, {10, 0}) = 1 (in [3], left {10, 0})
gettimeofday({1293082231, 917977}, {4294966966, 0}) = 0
recv(3, "/\210,\215\0(\1\23\0\0\25O\0\0\0\0\0\0\0\0", 20, 0) = 20
recv(3, "TLicense\0\0\0K\232\0\0\0\0\0TD", 20, 0) = 20
shutdown(3, 2 /* send and receive */) = 0
close(3) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
stat64("/home/hindustan.psbuild/.flexlmrc", {st_mode=S_IFREG|0664, st_size=99, ...}) = 0
gettimeofday({1293082231, 918341}, NULL) = 0
open("/etc/resolv.conf", O_RDONLY) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=25, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fa2000
read(4, "nameserver 10.100.200.55\n", 4096) = 25
read(4, "", 4096) = 0
close(4) = 0
munmap(0xb7fa2000, 4096) = 0
uname({sys="Linux", node="VNL-HINDUSTAN", ...}) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 4
fcntl64(4, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = 0
poll([{fd=4, events=POLLOUT|POLLERR|POLLHUP, revents=POLLOUT}], 1, 5000) = 1
send(4, "\2\0\0\0\r\0\0\0\6\0\0\0hosts\0\256\0", 20, MSG_NOSIGNAL) = 20
poll([{fd=4, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1
recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"hosts\0", 6}], msg_controllen=16, {cmsg_len=16, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {5}}, msg_flags=0}, 0) = 6
fstat64(5, {st_mode=S_IFREG|0600, st_size=217016, ...}) = 0
pread64(5, "\1\0\0\0h\0\0\0\232\4\0\0\1\0\0\0\\\"\17M\0\0\0\0\323\0"..., 104, 0) = 104
mmap2(NULL, 217016, PROT_READ, MAP_SHARED, 5, 0) = 0xb7f6e000
close(5) = 0
close(4) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV (core dumped) +++
Process 22137 detached
|
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
Is that trying to access RAM address 0x0000? It dumped a core, so you can play with gdb, but segmentation faults are memory faults. If the memory is good (is it?), then it's a genuine crash, implying faulty software, disk, or possibly source. Often even handier than memtest86 is simply to rerun the command (up arrow and Return). If the build gets any further the second time, you have a memory problem. |
Thanks for your response. Will try to debug further.
|
Quote:
Code:
gdb /path/to/your_app core_file.pid |
Thanks Quanta.
I have already tried that but was not able to debug it; below is the gdb output:

>gdb generate core.8630
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...generate: No such file or directory.
Core was generated by ` hindustan.psbuild x /home/hindustan.psbuild /bin/bash '.
Program terminated with signal 11, Segmentation fault.
#0 0x00a29a2c in ?? ()
(gdb) bt
#0 0x00a29a2c in ?? ()
#1 0x0191ce77 in ?? ()
#2 0xbfff7ae4 in ?? ()
#3 0xbfff6f2c in ?? ()
#4 0x019451d2 in ?? ()
#5 0xbfff6ec0 in ?? ()
#6 0xbfff6f2c in ?? ()
#7 0x00000c00 in ?? ()
#8 0xbfff6ebc in ?? ()
#9 0x51d3640a in ?? ()
#10 0x00000000 in ?? ()

Could you help me debug it further? Thanks!
|
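One hedged reading of the transcript above: gdb prints "generate: No such file or directory", so it had no executable to load symbols from, which would be consistent with every frame showing as "??". Below is a minimal Python sketch that checks a saved gdb transcript for both symptoms. The function name summarize_gdb_transcript and the dictionary keys are made up for illustration, and the regular expressions assume the message formats shown in this thread.

```python
import re

def summarize_gdb_transcript(text):
    """Heuristically summarize a saved gdb session (hypothetical helper)."""
    # gdb reports "<name>: No such file or directory." when it cannot open
    # the executable named on its command line. In this thread the message
    # is glued to the banner with "...", so keep only the part after it.
    missing = None
    match = re.search(r"(\S+): No such file or directory", text)
    if match:
        missing = match.group(1).rsplit("...", 1)[-1]

    # Frames with an address look like "#3 0xbfff6f2c in ?? ()"; the name
    # is "??" when gdb could not map the address to a symbol.
    names = re.findall(r"#\d+\s+0x[0-9a-fA-F]+ in (\S+) \(", text)
    unresolved = sum(1 for name in names if name == "??")

    return {
        "missing_executable": missing,
        "frames": len(names),
        "unresolved": unresolved,
    }
```

If missing_executable is set and all frames are unresolved, rerunning gdb against the binary that actually crashed (rather than a placeholder name) is probably the first thing to try.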
Oh, I don't know why you got "??" for every frame in the backtrace.
Did you notice this line: Quote:
|
Since the process does not complete, the target generate is never produced. But since gdb requires an application name, I just supplied one (a dummy name).
From my debugging so far, there seems to be some issue with /var/run/nscd/socket, since removing this file and killing the nscd process solves the problem for me. But I am not sure what the repercussions could be. I need to look further into why the nscd process is causing the problem and what the impact would be if I don't run it on the system. Anyway, thanks again to Quanta and Business_Kid for your help. Regards, Vivek |
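The way the strace output points at the nscd socket can be sketched roughly as follows: find the signal marker in the trace and list the file paths touched in the last few calls before it. This is only a minimal Python sketch, assuming the trace is saved with one syscall per line; the function name paths_before_signal and the window size are made up for illustration, and the result shows what the process touched last, not necessarily what caused the crash.

```python
import re

# strace prints a line such as "--- SIGSEGV (Segmentation fault) @ 0 (0) ---"
# when the traced process receives a signal.
SIGNAL_LINE = re.compile(r"^--- (SIG[A-Z0-9]+)")
# Quoted strings that start with "/" and contain no backslash escapes, which
# skips binary payloads such as "/\210,\215..." seen in recv() calls.
PATH_ARG = re.compile(r'"(/[^"\\]*)"')

def paths_before_signal(lines, window=15):
    """Return (signal_name, paths) for the first signal marker in the trace.

    paths holds the distinct absolute paths seen in the last `window`
    syscall lines before the marker, in order of first appearance.
    Returns (None, []) if the trace contains no signal marker.
    """
    history = []
    for raw in lines:
        line = raw.strip()
        match = SIGNAL_LINE.match(line)
        if match:
            paths = []
            for call in history[-window:]:
                for path in PATH_ARG.findall(call):
                    if path not in paths:
                        paths.append(path)
            return match.group(1), paths
        if line:
            history.append(line)
    return None, []
```

It could be run on a saved trace with something like paths_before_signal(open("trace.txt")), where trace.txt is a hypothetical file name.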