LinuxQuestions.org

vivek677 02-25-2011 03:10 AM

Segmentation Fault on Linux Red Hat AS release 4 (Nahant Update 4): 2.6.9-42.ELsmp
 
Since a recent reboot, our server segfaults when running one application (Rational Tau 3.0). Below is the error log:

/home1/env/tau3.0/bin/taubatch --project ../Model/TRXHM.ttp --config Target -G
Telelogic Tau Application Builder

make: *** [generate] Segmentation fault (core dumped)

The server was rebooted after a firmware upgrade on the network storage; no changes were made to the server settings. Interestingly, the problem occurs only with one particular version (Tau 3.0), while the other version works fine.

The application (Tau 3.0) lives on a network share (mounted on the server) and works fine on other machines.

I'd appreciate any help debugging this issue.

Thanks in advance.

Below is strace output for reference:
-------------------------------------

readlink("/proc/self/fd/0", "/dev/pts/1", 1023) = 10
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
rt_sigaction(SIGPIPE, {0x16fc184, [], SA_RESTORER, 0xc25898}, {SIG_IGN}, 8) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
stat64("/home/hindustan.psbuild/.flexlmrc", {st_mode=S_IFREG|0664, st_size=99, ...}) = 0
fcntl64(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(19353), sin_addr=inet_addr("10.100.211.81")}, 16) = -1 EINPROGRESS (Operation now in progress)
select(4, NULL, [3], NULL, {10, 0}) = 1 (out [3], left {10, 0})
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
send(3, "h\00413hindustan.psbuild\0\0\0\0VNL-HIN"..., 147, 0) = 147
gettimeofday({1293082231, 917489}, {4294966966, 0}) = 0
select(4, [3], NULL, NULL, {10, 0}) = 1 (in [3], left {10, 0})
gettimeofday({1293082231, 917842}, {4294966966, 0}) = 0
gettimeofday({1293082231, 917876}, {4294966966, 0}) = 0
gettimeofday({1293082231, 917916}, {4294966966, 0}) = 0
select(4, [3], NULL, NULL, {10, 0}) = 1 (in [3], left {10, 0})
gettimeofday({1293082231, 917977}, {4294966966, 0}) = 0
recv(3, "/\210,\215\0(\1\23\0\0\25O\0\0\0\0\0\0\0\0", 20, 0) = 20
recv(3, "TLicense\0\0\0K\232\0\0\0\0\0TD", 20, 0) = 20
shutdown(3, 2 /* send and receive */) = 0
close(3) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
stat64("/home/hindustan.psbuild/.flexlmrc", {st_mode=S_IFREG|0664, st_size=99, ...}) = 0
gettimeofday({1293082231, 918341}, NULL) = 0
open("/etc/resolv.conf", O_RDONLY) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=25, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fa2000
read(4, "nameserver 10.100.200.55\n", 4096) = 25
read(4, "", 4096) = 0
close(4) = 0
munmap(0xb7fa2000, 4096) = 0
uname({sys="Linux", node="VNL-HINDUSTAN", ...}) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 4
fcntl64(4, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = 0
poll([{fd=4, events=POLLOUT|POLLERR|POLLHUP, revents=POLLOUT}], 1, 5000) = 1
send(4, "\2\0\0\0\r\0\0\0\6\0\0\0hosts\0\256\0", 20, MSG_NOSIGNAL) = 20
poll([{fd=4, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1
recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"hosts\0", 6}], msg_controllen=16, {cmsg_len=16, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {5}}, msg_flags=0}, 0) = 6
fstat64(5, {st_mode=S_IFREG|0600, st_size=217016, ...}) = 0
pread64(5, "\1\0\0\0h\0\0\0\232\4\0\0\1\0\0\0\\\"\17M\0\0\0\0\323\0"..., 104, 0) = 104
mmap2(NULL, 217016, PROT_READ, MAP_SHARED, 5, 0) = 0xb7f6e000
close(5) = 0
close(4) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV (core dumped) +++
Process 22137 detached
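
When reading a long trace like this, the calls immediately before the signal are usually the most useful part. The following is a minimal sketch (the function name `calls_before_signal` is hypothetical, and it assumes the trace is plain single-process strace output) that pulls out the last few lines before a given signal report:

```python
def calls_before_signal(trace_lines, signal="SIGSEGV", count=5):
    """Return up to `count` non-empty lines that precede the first
    '--- <signal> ...' report, or [] if that signal never appears."""
    marker = "--- " + signal + " "
    seen = []
    for raw in trace_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(marker):
            # Keep only the tail; the max() guards against count <= 0.
            return seen[max(len(seen) - count, 0):]
        seen.append(line)
    return []


if __name__ == "__main__":
    # Lines copied from the end of the trace above.
    sample = [
        'connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = 0',
        "mmap2(NULL, 217016, PROT_READ, MAP_SHARED, 5, 0) = 0xb7f6e000",
        "close(5) = 0",
        "close(4) = 0",
        "--- SIGSEGV (Segmentation fault) @ 0 (0) ---",
    ]
    for call in calls_before_signal(sample, count=3):
        print(call)
```

In the trace above, the last activity before the SIGSEGV is the exchange with /var/run/nscd/socket and the mapping of the file descriptor received over it, which at least suggests where to look first.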

business_kid 02-25-2011 08:49 AM

--- SIGSEGV (Segmentation fault) @ 0 (0) ---

Is that trying to access RAM address 0x0000? It dumped a core, so you can play with gdb, but segmentation faults are memory faults. If the memory is good (is it?), then it's a genuine crash, implying faulty software, disk, or possibly source.

Often even handier than memtest86 is to hit up arrow and Return to rerun the command. If the compile gets any further than before, you have a memory problem.

vivek677 02-27-2011 11:09 PM

Thanks for your response. Will try to debug further.

quanta 02-27-2011 11:19 PM

Quote:

[generate] Segmentation fault (core dumped)
Find out where the core file was saved and try to debug it with gdb:
Code:

gdb /path/to/your_app core_file.pid

vivek677 03-02-2011 10:48 PM

Thanks Quanta.

I have already tried that but was not able to debug it; below is the output of gdb:

>gdb generate core.8630
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...generate: No such file or directory.

Core was generated by ` hindustan.psbuild x /home/hindustan.psbuild /bin/bash '.
Program terminated with signal 11, Segmentation fault.
#0 0x00a29a2c in ?? ()
(gdb) bt
#0 0x00a29a2c in ?? ()
#1 0x0191ce77 in ?? ()
#2 0xbfff7ae4 in ?? ()
#3 0xbfff6f2c in ?? ()
#4 0x019451d2 in ?? ()
#5 0xbfff6ec0 in ?? ()
#6 0xbfff6f2c in ?? ()
#7 0x00000c00 in ?? ()
#8 0xbfff6ebc in ?? ()
#9 0x51d3640a in ?? ()
#10 0x00000000 in ?? ()


Could you help me debug it further?

Thanks!

quanta 03-03-2011 12:04 AM

Oh, I don't know why you got "??" for every frame in the backtrace.

Did you notice this line:
Quote:

This GDB was configured as "i386-redhat-linux-gnu"...generate: No such file or directory.
Is that the complete line, or did you abbreviate it with the "..."?
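
For what it's worth, gdb can only turn addresses into function names when it is given the executable that actually crashed (plus the shared libraries it loaded), so a "No such file or directory" message for the program name and a backtrace full of "??" tend to go together. As a rough sketch, the hypothetical helper below (`summarize_backtrace` is not part of gdb) scans pasted gdb output for both symptoms:

```python
import re

# Matches frames such as "#0 0x00a29a2c in ?? ()" or "#1 main () at x.c:3".
FRAME = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)")


def summarize_backtrace(gdb_output):
    """Count distinct frames, unresolved ("??") frames, and note whether
    gdb complained that a file it was given does not exist."""
    frames = {}
    missing_executable = False
    for raw in gdb_output.splitlines():
        line = raw.strip()
        if "No such file or directory" in line:
            missing_executable = True
        match = FRAME.match(line)
        if match:
            # Frame #0 is printed twice (crash summary and "bt"); keying by
            # frame number counts it once.
            frames[int(match.group(1))] = match.group(2)
    unresolved = sum(1 for name in frames.values() if name == "??")
    return {
        "frames": len(frames),
        "unresolved": unresolved,
        "missing_executable": missing_executable,
    }


if __name__ == "__main__":
    pasted = (
        'This GDB was configured as "i386-redhat-linux-gnu"...'
        "generate: No such file or directory.\n"
        "#0 0x00a29a2c in ?? ()\n"
        "(gdb) bt\n"
        "#0 0x00a29a2c in ?? ()\n"
        "#1 0x0191ce77 in ?? ()\n"
    )
    print(summarize_backtrace(pasted))
```

If every frame is unresolved and the executable was reported missing, rerunning gdb with the path of the real binary is the first thing to try. A stripped binary may still show few names even then.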

vivek677 03-03-2011 01:12 AM

Since the process does not complete, the target "generate" is never created. But since gdb requires an application name, I just gave that one (as a dummy name).

From my debugging so far, there seems to be some issue with /var/run/nscd/socket, since removing this file and killing the nscd process solves the problem for me.

But I am not sure what the repercussions could be. I need to look further into why the nscd process is causing the problem and what the impact would be if I don't run it on the system.

Anyway, thanks again to Quanta and Business_Kid for your help.

Regards,
Vivek
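
On the question of impact: nscd is a caching daemon, and when it is not running the C library performs host, user and group lookups itself according to /etc/nsswitch.conf, only without the cache. One way to check whether host lookups alone are affected is a small test run once with nscd running and once with it stopped. The sketch below assumes a current Python 3 is available on the machine (the stock Python on this release is much older) and relies on `socket.getaddrinfo`, which calls the C library resolver. The function name `resolve` is hypothetical, and the test uses the system glibc, which may not be the same library code path that Tau exercises.

```python
import socket
import sys


def resolve(hostname):
    """Resolve `hostname` through the C library resolver and return the
    sorted unique addresses, or [] if the lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Hostnames come from the command line; "localhost" is only a default.
    for name in sys.argv[1:] or ["localhost"]:
        print(name, resolve(name))
```

If this resolves names cleanly with nscd running, the crash is more likely tied to how Tau 3.0 or its bundled libraries talk to nscd than to host lookups in general.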


All times are GMT -5.