LinuxQuestions.org
Old 04-17-2011, 11:55 AM   #1
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Rep: Reputation: 49
Server unresponsive, don't see anything in logs


I had a server that wasn't pingable, and the console was just black, so it had to be hard powered down.

Looking in the messages file, these were the last three entries; 11:54 AM was when I brought it back up.

Apr 17 01:04:26 servername ntpd[25230]: kernel time discipline status change 1
Apr 17 04:02:04 servername syslogd 1.4.1: restart.
Apr 17 11:54:46 servername syslogd 1.4.1: restart.

Should I be looking somewhere else to figure out what happened?
 
Old 04-18-2011, 02:49 AM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3608
// Since you have posted many threads by now regarding one or more problematic servers it would be good to add references if this applies to one of those mentioned before IMO.

That said, your log looks too clean, unless the machine saw no activity at all and no subsystems or services log to syslog, or syslog was dead for some time. If the syslog restarts correspond with any log rotation (check /etc/cron.*, /var/log/cron, /etc/logrotate.d?, /var/lib/logrotate.status), do check /var/log/[logfilename].1(.gz?) in addition to /var/log/[logfilename]. A quick way to get an overview and find any anomalies could be to run Logwatch with the --archives and --range Yesterday options on all logs. Do check 'lastlog; lastb; last' for any logins just in case. Apart from /var/log/{kernel,messages,secure,cron,maillog}, whatever logs you check depends on what services the machine provides. Be sure to reply verbosely if you find anything "interesting".
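
As a rough illustration of the "check the rotated copies too" advice above, here is a minimal Python sketch that collects a log file together with its rotated siblings, oldest first, so they can be read as one timeline. The function names (`rotated_logs`, `read_lines`) are made up for this example, and it assumes rotated copies are named `<logfile>.N` or `<logfile>.N.gz`; other logrotate naming schemes (date suffixes, for example) are not handled.

```python
import glob
import gzip
import re


def rotated_logs(base):
    """Return `base` plus its numbered rotations (base.1, base.2.gz, ...), oldest first.

    Assumption: a higher rotation number means an older file.
    """
    pattern = re.compile(re.escape(base) + r"(?:\.(\d+)(?:\.gz)?)?$")
    found = []
    for path in glob.glob(glob.escape(base) + "*"):
        match = pattern.match(path)
        if match:
            # The current (unrotated) file gets index 0 so it sorts last.
            found.append((int(match.group(1) or 0), path))
    return [path for _, path in sorted(found, reverse=True)]


def read_lines(path):
    """Yield text lines from a plain or gzip-compressed log file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        yield from handle


# Hypothetical usage (reading /var/log usually needs root):
#   for path in rotated_logs("/var/log/messages"):
#       for line in read_lines(path):
#           print(path, line, end="")
```

Reading the files oldest first keeps the entries in roughly chronological order, which makes a silent stretch before a crash easier to spot.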
 
Old 04-18-2011, 06:28 AM   #3
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
Hey unSpawn, it is one of those servers. It just has Samba shares pushing 2TB of data.

The server is down now running file system checks, so I'll go back through those logs and see if I can find anything "interesting" when it's back up.
 
Old 04-19-2011, 04:52 AM   #4
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
The server is finally back up, going to look at the logs now.

I do notice when looking in /proc/mdstat, it shows the following. I'm guessing when my system went down hard, something got messed up, but I don't understand what exactly it's doing by resyncing. All my data on md0 and md1 appears to be there. Does anyone know what it's doing exactly?

Personalities : [raid1]
read_ahead 1024 sectors
Event: 2
md0 : active raid1 sdb1[1] sda1[0]
859549696 blocks [2/2] [UU]
resync=DELAYED
md1 : active raid1 sdb2[1] sda2[0]
859533632 blocks [2/2] [UU]
[>....................] resync = 2.1% (18808576/859533632) finish=1404.8min speed=9971K/sec
unused devices: <none>
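
For reading the resync line in the output above, a minimal Python sketch follows. The function name `parse_resync` is made up for this example. It assumes the two counts in parentheses are 1 KiB blocks and that the speed is in KiB per second, which is how md normally reports them; the estimate is only approximate, since the kernel calculates its own `finish=` figure.

```python
import re

# Matches e.g. "resync = 2.1% (18808576/859533632) finish=1404.8min speed=9971K/sec"
RESYNC_RE = re.compile(
    r"resync\s*=\s*(?P<pct>[\d.]+)%\s*\((?P<done>\d+)/(?P<total>\d+)\)"
    r"\s*finish=(?P<finish>[\d.]+)min\s*speed=(?P<speed>\d+)K/sec"
)


def parse_resync(line):
    """Return progress figures for an mdstat resync line, or None if it has none.

    Lines such as "resync=DELAYED" carry no numbers and yield None.
    """
    match = RESYNC_RE.search(line)
    if match is None:
        return None
    done = int(match.group("done"))
    total = int(match.group("total"))
    speed = int(match.group("speed"))  # assumed KiB per second
    remaining = total - done           # assumed 1 KiB blocks
    return {
        "percent_done": 100.0 * done / total,
        "remaining_blocks": remaining,
        "eta_minutes": remaining / speed / 60.0,
        "reported_finish_minutes": float(match.group("finish")),
    }


md1_line = ("[>....................] resync = 2.1% (18808576/859533632) "
            "finish=1404.8min speed=9971K/sec")
info = parse_resync(md1_line)
print(f"{info['percent_done']:.2f}% done, about {info['eta_minutes']:.0f} min left")
```

For the md1 line from the post this works out to a little over 2% done and roughly 1405 minutes remaining at the current speed, which is close to the kernel's own finish=1404.8min.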

---------- Post added 04-19-11 at 05:53 AM ----------

Also, what happens if people start adding data to the server before these resyncs finish? Will that cause problems?
 
Old 04-19-2011, 09:30 AM   #5
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
Looking through my cron log, I see activity (I have a script that runs every minute) that leads me to believe the server was up until Apr 17 04:29:01, as that's the last event until after I brought it back up at 11:54. There's nothing in the Samba log other than Samba restarting around 11:54, nothing in secure, I don't have a kernel log to look in, and maillog doesn't show anything useful from what I can tell.
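
The approach described above, treating a job that logs every minute as a heartbeat and looking for where the entries stop, can be sketched in Python. The helper names (`parse_stamp`, `largest_gap`) are hypothetical, and because classic syslog timestamps carry no year, the year is passed in as an assumption. The cron lines themselves were not posted, so the sample uses the three messages entries from the first post instead; the input is assumed to be in chronological order.

```python
from datetime import datetime


def parse_stamp(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp, or return None."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def largest_gap(lines, year):
    """Return (last entry before, first entry after) the longest silence, or None."""
    stamps = [s for s in (parse_stamp(l, year) for l in lines) if s is not None]
    best = None
    for earlier, later in zip(stamps, stamps[1:]):
        if best is None or later - earlier > best[1] - best[0]:
            best = (earlier, later)
    return best


sample = [
    "Apr 17 01:04:26 servername ntpd[25230]: kernel time discipline status change 1",
    "Apr 17 04:02:04 servername syslogd 1.4.1: restart.",
    "Apr 17 11:54:46 servername syslogd 1.4.1: restart.",
]
last_seen, first_back = largest_gap(sample, year=2011)
print("silent from", last_seen, "until", first_back)
```

For the sample, the longest silence runs from 04:02:04 to 11:54:46. Run against a per-minute cron log, the first timestamp of the gap would be the last moment the machine is known to have been alive.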
 
Old 04-19-2011, 05:48 PM   #6
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3608
As you haven't confirmed the cron schedule, log rotation, the existence and contents of logrotated logs, anything Logwatch could have reported, auth database records, nor any info wrt services the machine provides, there is literally nothing for me to diagnose or help you with. Bummer. I suggest you set up watches like atop process logging and syslog markers to at least get a "ping" of it being alive, and maybe run Monit or Nagios to keep a watch on the machine and/or services.
 
Old 04-20-2011, 09:52 AM   #7
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
The only thing cron'd on that server that would have run around that time is the task I mentioned in that log post. I checked all the cron files as you suggested. Doesn't logrotate just rotate the log files within /var/log? I looked through all of them already in my previous post.

As I stated in my first reply to you when you asked what server, this server is just a samba server, which also has mail to send some stuff out on occasion. Guess you were looking for more specifics as to what all I checked? Also, I've never used logwatch before so not sure how to go about doing that. But I did look through every log in /var/log manually for any "interesting" stuff as you first suggested. I was the last person to log onto the server before it happened, and all I did was a df on it.

When the server was just at a black screen, it wouldn't respond to keystrokes at the console, to SSH connections, or to pings. I forgot to mention SSH in my OP.

I guess I'm not understanding what info I haven't provided, other than whatever logwatch can do. I still only use Linux minimally, and I'm not trying to be difficult; sorry if this is bumming you out. It really just looks like "something" died and the server stopped logging events, at least in the places I know of to check.

I'm really not trying to be a pain here, which, based on your last response, I'm guessing I am.
 
Old 04-20-2011, 09:54 AM   #8
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
I'll just open another thread to see if someone can help me understand that /proc/mdstat stuff, as I don't get what it's actually resyncing. Don't feel like you need to reply any more, unSpawn; you've already gone above and beyond.
 
Old 04-20-2011, 04:32 PM   #9
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3608
No, that's not it, and I missed your comment about the basic marks cron leaves for the per-minute cronjob, sorry. Problem is, Linux doesn't come with extensive auditing features enabled out of the box to let you troubleshoot less common problems easily. If the machine is comparable with another one in terms of HW / distro / release / userland SW, then you could diff them and see if there's any clue in that; otherwise you may need additional tools to help you troubleshoot the issue. Watching service processes and logging connectivity from and towards the machine could be a first step.
 
Old 04-20-2011, 04:52 PM   #10
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
Yeah, it's weird, it's like the system just stopped doing everything, at least from what I can tell. It's an older server; I was just hoping there might be something somewhere that would tell me whether it's a hardware failure or what. I appreciate all the help with this.
 
Old 04-24-2011, 10:29 AM   #11
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
unSpawn, it just happened again, 2nd week in a row. Can you give me some advice on how to check that other stuff I don't know? This is way over my head at this point. :'(
 
Old 04-25-2011, 08:03 AM   #12
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
OK, looking in messages and messages.1, this is the last stuff that shows up before I brought the server back up shortly after noon. This time, looking in my cron log, which logs an event every minute, the last event is at 9:02, and the next is at 12:09:30, which is from when cron started. So I guess this time it didn't actually die until after 9:02, which is different than last time based on the cron logs.

Apr 24 04:02:04 ServerNameReplaced nmbd[2212]: [2011/04/24 04:02:04, 0] nmbd/nmbd.c:process(542)
Apr 24 04:02:04 ServerNameReplaced syslogd 1.4.1: restart.
Apr 24 04:02:04 ServerNameReplaced nmbd[2212]: Got SIGHUP dumping debug info.
Apr 24 04:02:04 ServerNameReplaced nmbd[2212]: [2011/04/24 04:02:04, 0] nmbd/nmbd_workgroupdb.c:dump_workgroups(284)
Apr 24 04:02:04 ServerNameReplaced nmbd[2212]: dump_workgroups()
Apr 24 04:02:04 ServerNameReplaced nmbd[2212]: dump workgroup on subnet 10.1.2.34: netmask= 255.255.254.0:
Apr 24 04:02:04 ServerNameReplaced nmbd[2212]: US(1) current master browser = UNKNOWN
Apr 24 04:02:04 ServerNameReplaced nmbd[2212]: ServerNameReplaced 40009a03 (Samba_photo1)
Apr 24 12:09:31 ServerNameReplaced syslogd 1.4.1: restart.
 
Old 04-25-2011, 12:47 PM   #13
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
OK, I think I figured out what you meant in post #2. If I look at /etc/crontab, cron.daily runs at 4:02 every day, and within cron.daily is a logrotate job. I don't see anything unusual in the cron log, as I mentioned in my last post, other than that the timing is later than it was last Sunday.

Does anyone have any ideas where to go next with this? I'm totally at a loss.
 
Old 04-25-2011, 01:21 PM   #14
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
And it just happened again, so I guess it isn't an every-Sunday thing then...
 
Old 04-26-2011, 05:07 AM   #15
anon091
Senior Member
 
Registered: Jun 2009
Posts: 1,795

Original Poster
Rep: Reputation: 49
OK, I actually found something out. I was able to look at the console as it was in the middle of going down, and it keeps repeating these messages on the screen:

aacraid: Host adapter reset request
percraid: Host adapter dead -3
 
  



All times are GMT -5. The time now is 10:22 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration