LinuxQuestions.org
Old 01-13-2009, 11:22 AM   #1
landysaccount
Member
 
Registered: Sep 2008
Location: Dominican Republic
Distribution: Debian Squeeze
Posts: 177

Rep: Reputation: 17
segfault error


Hello,

Today I've come across a big problem (to me): my router/firewall, running Debian Etch 4.0 with a 2.6-28 kernel, locked up. I could ping it and ping the internet, but I couldn't connect to it. I turned on the monitor and saw the following:

I tried to login:

login[1879]: segfault at 0 ip b7ec346d sp bff0e060 error 4 in libpam.so.0.79[b7ebf000+7000]

tried to reboot with ctrl-alt-del:

shutdown[2883]: segfault at bff4327d ip b7e27141 sp bfd1421c error 4 in libc-2.3.6.so[b7dc8000+127000]

It just locked up. I pressed the reset button, got a whole bunch of segfault errors, and it didn't come up.

I have no idea what's going on.

What's wrong?
 
Old 01-13-2009, 06:23 PM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 27,766
Blog Entries: 54

Rep: Reputation: 2976
I don't know. Could you boot a Live CD, mount the partitions read-only, and try to figure out from the logs whether something got installed, de-installed, updated, reconfigured, et cetera recently?
 
Old 01-13-2009, 06:55 PM   #3
GaijinPunch
Member
 
Registered: Aug 2003
Location: Tokyo, Japan
Distribution: Gentoo
Posts: 130

Rep: Reputation: 22
Are you logging in on the console or via SSH?
 
Old 01-13-2009, 09:11 PM   #4
landysaccount
Member
 
Registered: Sep 2008
Location: Dominican Republic
Distribution: Debian Squeeze
Posts: 177

Original Poster
Rep: Reputation: 17
Quote:
Originally Posted by GaijinPunch
Are you logging in on the console or via SSH?
I can't log in at all.

Could it be a hardware problem?

Last edited by landysaccount; 01-13-2009 at 09:14 PM.
 
Old 01-13-2009, 10:20 PM   #5
GaijinPunch
Member
 
Registered: Aug 2003
Location: Tokyo, Japan
Distribution: Gentoo
Posts: 130

Rep: Reputation: 22
Quote:
I tried to login:
How did you *try* to log in?

Quote:
Could it be a hardware problem?
Most definitely could be. Segfaults generally occur when a program tries to access a piece of memory it isn't allowed to touch. My latest debacle was with brand new hardware. I couldn't compile anything that took more than a few minutes without a segmentation fault. I ran memtest86 for a full day with no errors. Swapped the memory out: voilà, problem solved.

On that note, answer the obvious following questions:
1: Has any software changed?
2: Has any hardware changed?

If either is yes, investigate there. If not, as suggested by unSpawn, you should try to boot to a LiveCD, mount the drive, and look for hints in the logs. If that doesn't work, you can try swapping out hardware to pinpoint it. As hinted, I would start with memory. It's cheap these days, and easy to do. To really test it, I would compile the kernel in an endless loop. I got a script somewhere (forgot where... I think the Gentoo forums) that did it and exited when there was an error.
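
A minimal sketch of that kind of loop, written in Python rather than taken from the script mentioned above (which is not reproduced here). The function name `run_until_failure` and the example build command in the note below are assumptions for illustration; any long, CPU- and memory-heavy command would do.

```python
import subprocess
import sys


def run_until_failure(command, max_runs=None):
    """Run `command` (a list of arguments) over and over.

    Returns (clean_runs, failing_returncode). failing_returncode is None
    if max_runs was reached without any run failing.
    """
    clean_runs = 0
    while max_runs is None or clean_runs < max_runs:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Stop at the first failure so the error stays on screen.
            print(
                f"run {clean_runs + 1} failed with return code {result.returncode}",
                file=sys.stderr,
            )
            return clean_runs, result.returncode
        clean_runs += 1
    return clean_runs, None
```

Called as, say, `run_until_failure(["make", "-C", "/usr/src/linux", "bzImage"])` (a hypothetical source path and target), it keeps going until a run fails. On POSIX systems a negative return code -N from `subprocess.run` means the command itself was killed by signal N, so -11 would be a SIGSEGV. A real kernel-compile loop would also need a clean step between passes so that each pass actually recompiles.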

Last edited by GaijinPunch; 01-13-2009 at 10:22 PM.
 
Old 01-14-2009, 03:57 AM   #6
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,005
Blog Entries: 11

Rep: Reputation: 903
Quote:
Originally Posted by landysaccount
I can't login at all.

Could it be a hardware problem?
If you didn't have any updates, then yes, by all means.
I've seen segfaults as an indicator of both a buggy/dying
chipset and dying RAM.


Cheers,
Tink
 
Old 01-14-2009, 03:05 PM   #7
landysaccount
Member
 
Registered: Sep 2008
Location: Dominican Republic
Distribution: Debian Squeeze
Posts: 177

Original Poster
Rep: Reputation: 17
Hello.

I forgot about the problem and didn't want to stress over it. Now I came home, plugged in the box, turned it on, and it booted up like a charm.

I don't know what magic solved the problem, but I will leave it on to see if the problem happens again. I was thinking of replacing the HD and the RAM and reinstalling Debian to see if it will work flawlessly. I don't know how to recreate the problem, since I don't know how or when it happened, but it is frustrating.

I will take a look at /var/log/messages and some other logs to see if I come across something there.

What do you guys recommend I do with this piece of crap?

Thank you.
 
Old 01-14-2009, 03:27 PM   #8
landysaccount
Member
 
Registered: Sep 2008
Location: Dominican Republic
Distribution: Debian Squeeze
Posts: 177

Original Poster
Rep: Reputation: 17
After looking at my /var/log/messages file I see this:

Jan 12 22:20:01 trahersa-test squid[1719]: Squid Parent: child process 2665 exited due to signal 6
Jan 12 22:20:01 trahersa-test kernel: squid[1719]: segfault at 48100813 ip 48100813 sp bff5dfbc error 4 in libnss_files-2.3.6.so[b7c44000+9000]
Jan 12 22:48:14 trahersa-test -- MARK --
Jan 12 22:54:57 trahersa-test dhcpd: Wrote 5 leases to leases file.
Jan 12 23:08:15 trahersa-test -- MARK --
Jan 12 23:28:15 trahersa-test -- MARK --
Jan 12 23:48:15 trahersa-test -- MARK --
Jan 13 00:08:15 trahersa-test -- MARK --
Jan 13 00:28:15 trahersa-test -- MARK --
Jan 13 00:48:15 trahersa-test -- MARK --
Jan 13 01:08:15 trahersa-test -- MARK --
Jan 13 01:28:15 trahersa-test -- MARK --
Jan 13 01:39:50 trahersa-test dhcpd: Wrote 5 leases to leases file.
Jan 13 02:08:15 trahersa-test -- MARK --
Jan 13 02:18:15 trahersa-test kernel: exim4[1605]: segfault at 81f2b28 ip 0805b3dd sp bfed9f00 error 4 in exim4[8048000+a5000]
Jan 13 02:28:15 trahersa-test -- MARK --
Jan 13 02:48:16 trahersa-test -- MARK --
Jan 13 03:08:16 trahersa-test -- MARK --
Jan 13 03:28:16 trahersa-test -- MARK --
Jan 13 03:48:16 trahersa-test -- MARK --
Jan 13 04:08:16 trahersa-test -- MARK --
Jan 13 04:09:01 trahersa-test kernel: cron[2830]: segfault at 1a ip b7f0c08c sp bff2faac error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 04:17:01 trahersa-test kernel: cron[2831]: segfault at 1a ip b7f0c08c sp bff2faac error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 04:28:16 trahersa-test -- MARK --
Jan 13 04:39:01 trahersa-test kernel: cron[2832]: segfault at 1a ip b7f0c08c sp bff2faac error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 05:02:01 trahersa-test kernel: cron[2833]: segfault at 10000 ip b7f0f48a sp bff2ff20 error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 05:09:01 trahersa-test kernel: cron[2834]: segfault at 10000 ip b7f0f48a sp bff2ff20 error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 05:17:01 trahersa-test kernel: cron[2835]: segfault at 10000 ip b7f0f48a sp bff2ff20 error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 05:28:16 trahersa-test -- MARK --
Jan 13 05:39:01 trahersa-test kernel: cron[2836]: segfault at 10000 ip b7f0f48a sp bff2ff20 error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 06:08:16 trahersa-test -- MARK --
Jan 13 06:09:01 trahersa-test kernel: cron[2837]: segfault at 10000 ip b7f0f48a sp bff2ff20 error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 06:17:01 trahersa-test kernel: cron[2838]: segfault at 10000 ip b7f0f48a sp bff2ff20 error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 06:25:01 trahersa-test kernel: cron[2839]: segfault at 10000 ip b7f0f48a sp bff2ff20 error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 06:39:01 trahersa-test kernel: cron[2840]: segfault at 10000 ip b7f0f48a sp bff2ff20 error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 07:08:17 trahersa-test -- MARK --
Jan 13 07:09:01 trahersa-test kernel: cron[2841]: segfault at 10000 ip b7f0f48a sp bff2ff20 error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 07:17:01 trahersa-test kernel: cron[2842]: segfault at 10000 ip b7f0f48a sp bff2vf20 error 4 in |ibpam.so.0.79[b7f0b000+7000]
Jan 13 07:28:17 trahersa-t<E5>st -- MARK --
Jan 13 07:39:01 trahersa-test kernel: cron[2843]: segfault at 10000 ip b7f0f48a sp bff2ff20 error 4 in libpam.so.0.79[b7f0b000+7000]
Jan 13 08:08:17 trahersa-test -- MARK --
Jan 13 08:0;:01 trahersa-test kernel: cron[2844]: segfault at b7f116ae ip b7f0e420 sp bff30030 error 7 in lybpam.so>0.79[b7f0b000+7000]
Jan 13 08:17:01 trahersa-test kernel: cron[2845]: sugfault at b7f116ae ip b7f0e420 sp bff30030 error 7 in libpam.so.0.79[b7f0b000+7000]
...
...
..
..
Jan 13 11:03:05 trahersa-test kernel: login[1879]: segfault at 0 ip b7ec346d sp bff0e060 error 4 in libpam.so.0.79[b7ebf000+7000]
Jan 13 11:04:49 trahersa-test kernel: login[2857]: segfault at 0 ip b7ed246d sp bfd1c670 error 4 in libpam.so.0.79[b7ece000+7000]
Jan 13 11:04:55 trahersa-test kernel: login[1887]: segfault at 0 ip b7f4146d sp bf98dae0 error 4 in libpam.so.0.79[b7f3d000+7000]
Jan 13 11:04:59 trahersa-test kernel: shutdown[2883]: segfault at bff4327d ip b7e27141 sp bfd1421c error 4 in libc-2.3.6.so[b7dc8000+127000]
Jan 13 11:06:44 trahersa-test kernel: login[2874]: segfault at 0 ip b7eee46d sp bfe38790 error 4 in libpam.so.0.79[b7eea000+7000]
Jan 13 11:06:50 trahersa-test kernel: login[2884]: segfault at 0 ip b7f9d46d sp bfce7e40 error 4 in libpam.so.0.79[b7f99000+7000]
Jan 13 11:06:56 trahersa-test shutdown[2902]: shutting down for system reboot
Jan 13 11:06:56 trahersa-test kernel: rc[2905]: segfault at 3131d807 ip 3131d807 sp bfacdf20 error 4 in libnsl-2.3.6.so[b7e1a000+12000]
Jan 13 11:06:56 trahersa-test kernel: sulogin[2907]: segfault at 9 ip b7e94d11 sp bff7440c error 4 in libc-2.3.6.so[b7de4000+127000]
Jan 13 11:07:03 trahersa-test kernel: sulogin[2929]: segfault at 0 ip b7f89218 sp bfc955f8 error 4 in ld-2.3.6.so[b7f7f000+15000]
Jan 13 11:07:07 trahersa-test kernel: sulogin[2946]: segfault at 0 ip b7fe4466 sp bfeef990 error 6 in ld-2.3.6.so[b7fd9000+15000]
Jan 13 11:07:07 trahersa-test kernel: udevd[2947]: segfault at fc ip ffffe4ab sp bfacfd94 error 6






Can anyone figure something out after seeing that?
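
For anyone reading lines like these, here is a rough Python sketch of how one such kernel line can be picked apart. It assumes the line format shown in the log above; the names `parse_segfault` and `decode_error` are made up for this example. The number after `error` is the x86 page-fault error code, printed in hex (bit 0: protection violation rather than a missing page, bit 1: write rather than read, bit 2: fault raised in user mode, bit 4: instruction fetch), and `ip` minus the base address in the brackets gives the offset of the faulting instruction inside the mapped file.

```python
import re

# Matches e.g. "login[1879]: segfault at 0 ip ... sp ... error 4 in lib[base+size]".
# The trailing " in ..." part is optional because some lines omit it.
SEGFAULT_RE = re.compile(
    r"(?P<proc>[\w.-]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_error(code):
    """Describe the x86 page-fault error code bits."""
    cause = "protection violation" if code & 0x1 else "page not present"
    if code & 0x10:
        access = "instruction fetch"
    else:
        access = "write" if code & 0x2 else "read"
    mode = "user" if code & 0x4 else "kernel"
    return f"{mode}-mode {access}, {cause}"


def parse_segfault(line):
    """Return a dict describing one kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    info = {
        "process": m["proc"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "error": decode_error(int(m["err"], 16)),
        "object": m["obj"],
        "offset": None,
    }
    if m["base"] is not None:
        # Offset of the faulting instruction inside the mapped object.
        info["offset"] = int(m["ip"], 16) - int(m["base"], 16)
    return info
```

Fed the `login[1879]` line from the log, this reports a user-mode read of a page that is not present, at address 0, at offset 0x446d inside libpam.so.0.79.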

Last edited by landysaccount; 01-14-2009 at 03:39 PM.
 
Old 01-14-2009, 04:23 PM   #9
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,005
Blog Entries: 11

Rep: Reputation: 903
Open the case and remove dust from the motherboard ;}

Could be thermal issues that cause RAM (and/or chips) to
start failing after some time of running.



Cheers,
Tink
 
Old 01-14-2009, 06:11 PM   #10
landysaccount
Member
 
Registered: Sep 2008
Location: Dominican Republic
Distribution: Debian Squeeze
Posts: 177

Original Poster
Rep: Reputation: 17
This is a brand new system and it is clean, no dust.
 
Old 01-14-2009, 06:58 PM   #11
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,005
Blog Entries: 11

Rep: Reputation: 903
In that case it's a thermal fault of some sort; take it to the shop
and have them check it. If it were a software related fault it would
be consistent, not good at boot and broken later.
 
Old 01-15-2009, 06:50 PM   #12
landysaccount
Member
 
Registered: Sep 2008
Location: Dominican Republic
Distribution: Debian Squeeze
Posts: 177

Original Poster
Rep: Reputation: 17
OK. It happened again. I left the system on and today, approximately 24 hours later, I got a segfault. This time it is different. I was logged in with PuTTY and that connection didn't close. I was able to do other things on the system and tried to log in with another PuTTY session, but just couldn't. It let me type the username and password, and after I pressed Enter it closed the window.

I rebooted and it came up all right.

I noticed while booting a message:
rtc_cmos
rtc0: alarm up to one day.


I don't know what that means. I googled it and it is something about acpid, which I don't have installed and which is also disabled in the BIOS.

What else might be causing this problem? It looks like it is hardware. Could it be a thermal issue like Tinkster mentioned?
 
Old 01-15-2009, 07:52 PM   #13
GaijinPunch
Member
 
Registered: Aug 2003
Location: Tokyo, Japan
Distribution: Gentoo
Posts: 130

Rep: Reputation: 22
I would highly recommend replacing the memory. It's cheap, and easy to do. If it's not that, the worst case is you have some extra memory (and in most cases, can use it somewhere). This definitely points to faulty hardware (of some sort). As pointed out, software related segfaults are consistent -- this is very inconsistent, and from your log, different applications are segfaulting.
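
The "different applications" point can be checked mechanically. A small Python sketch, assuming the kernel line format seen in the log earlier in this thread; `tally_segfaults` and `SAMPLE_LINES` are hypothetical names, and the sample lines are copied from that log.

```python
import re
from collections import Counter

# Process name before "[pid]: segfault", plus the optional mapped file after " in ".
LINE_RE = re.compile(
    r"(?P<proc>[\w.-]+)\[\d+\]: segfault at .*? error [0-9a-f]+"
    r"(?: in (?P<obj>\S+)\[[0-9a-f]+\+[0-9a-f]+\])?"
)

SAMPLE_LINES = [
    "Jan 12 22:20:01 trahersa-test kernel: squid[1719]: segfault at 48100813 ip 48100813 sp bff5dfbc error 4 in libnss_files-2.3.6.so[b7c44000+9000]",
    "Jan 13 02:18:15 trahersa-test kernel: exim4[1605]: segfault at 81f2b28 ip 0805b3dd sp bfed9f00 error 4 in exim4[8048000+a5000]",
    "Jan 13 02:28:15 trahersa-test -- MARK --",
    "Jan 13 04:09:01 trahersa-test kernel: cron[2830]: segfault at 1a ip b7f0c08c sp bff2faac error 4 in libpam.so.0.79[b7f0b000+7000]",
    "Jan 13 11:03:05 trahersa-test kernel: login[1879]: segfault at 0 ip b7ec346d sp bff0e060 error 4 in libpam.so.0.79[b7ebf000+7000]",
    "Jan 13 11:04:59 trahersa-test kernel: shutdown[2883]: segfault at bff4327d ip b7e27141 sp bfd1421c error 4 in libc-2.3.6.so[b7dc8000+127000]",
]


def tally_segfaults(lines):
    """Count segfault lines per crashing process and per mapped file."""
    by_process = Counter()
    by_object = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a segfault line (e.g. a MARK line)
        by_process[m["proc"]] += 1
        by_object[m["obj"] or "(unknown)"] += 1
    return by_process, by_object
```

On these sample lines it counts five different processes crashing in four different mapped files, with libpam.so.0.79 appearing twice. One program crashing at one spot every time suggests a software bug; crashes spread across unrelated programs fit the faulty-hardware reading better.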

Interested to hear the outcome.
 
Old 01-16-2009, 12:14 PM   #14
landysaccount
Member
 
Registered: Sep 2008
Location: Dominican Republic
Distribution: Debian Squeeze
Posts: 177

Original Poster
Rep: Reputation: 17
Ok.

I took the box to the shop and the technician there recommended replacing the CPU and fan; they replaced them for free. I will let the box run for a while to see if the problem happens again. If it does, I shall replace the memory.

I will keep you posted.
 
Old 01-16-2009, 12:17 PM   #15
landysaccount
Member
 
Registered: Sep 2008
Location: Dominican Republic
Distribution: Debian Squeeze
Posts: 177

Original Poster
Rep: Reputation: 17
But what does this really mean, anyway:

rtc_cmos
rtc0: alarm up to one day.
 
  


All times are GMT -5. The time now is 09:34 PM.
