LinuxQuestions.org
Old 09-03-2009, 11:29 PM   #1
eldhochacko
LQ Newbie
 
Registered: Sep 2009
Location: kochi
Posts: 12

Rep: Reputation: 0
How can I find out when a server goes down?


When I checked my server this morning, I found that it had hung.
How can I tell whether it hung or not?
Where can I find a log that shows the exact time it happened?
 
Old 09-03-2009, 11:30 PM   #2
JulianTosh
Member
 
Registered: Sep 2007
Location: Las Vegas, NV
Distribution: Fedora / CentOS
Posts: 674
Blog Entries: 3

Rep: Reputation: 90
You'll need another server to monitor it. Usually via pings or some other kind of service test.

Check /var/log/messages for the last message to see if there's anything useful.
 
Old 09-03-2009, 11:34 PM   #3
eldhochacko
LQ Newbie
 
Registered: Sep 2009
Location: kochi
Posts: 12

Original Poster
Rep: Reputation: 0
I checked that log, but I couldn't find the exact time at which the server went down or hung.
 
Old 09-03-2009, 11:35 PM   #4
JulianTosh
Member
 
Registered: Sep 2007
Location: Las Vegas, NV
Distribution: Fedora / CentOS
Posts: 674
Blog Entries: 3

Rep: Reputation: 90
If the server freezes... there's not much you can do about that. Check application logs for the last timestamped entry to guesstimate when it went down - and perhaps why too.

That's why it's nice to have another system monitoring it, so you can get a timeline of its state: memory usage, CPU usage, disk usage, etc.
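
A minimal sketch of that kind of outside monitoring, assuming it runs on a second machine and that the watched service listens on a TCP port. The names `tcp_is_up`, `record_transitions`, and `watch` are made up for illustration; only `socket.create_connection` is a real library call. It keeps only the moments when the state changes, which gives a rough timeline of when the server stopped answering.

```python
import socket
import time


def tcp_is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def record_transitions(samples):
    """Collapse (timestamp, is_up) samples into a list of state changes."""
    timeline = []
    previous = None
    for stamp, is_up in samples:
        if is_up != previous:
            timeline.append((stamp, "UP" if is_up else "DOWN"))
            previous = is_up
    return timeline


def watch(host, port, interval=60, rounds=5,
          check=tcp_is_up, clock=time.time, sleep=time.sleep):
    """Poll the service `rounds` times and return its up/down timeline."""
    samples = []
    for _ in range(rounds):
        samples.append((clock(), check(host, port)))
        sleep(interval)
    return record_transitions(samples)
```

Something like `watch("cginq01", 22)` would poll SSH on the host from the posted log once a minute; the port and interval are assumptions to adjust for the real service.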
 
Old 09-04-2009, 12:15 AM   #5
eldhochacko
LQ Newbie
 
Registered: Sep 2009
Location: kochi
Posts: 12

Original Poster
Rep: Reputation: 0
Hi Beotch,

Thanks for your reply.

I am sending my server's system log details. I manually restarted the server on the morning of September 3rd.

We were working on our application until 3 AM on Sep 1,

but nothing for that period appears in the messages log.

What could explain the missing log entries for Sep 1 and Sep 2?


Aug 31 00:08:46 cginq01 last message repeated 3 times
Aug 31 07:04:12 cginq01 last message repeated 3 times
Aug 31 08:17:49 cginq01 last message repeated 3 times
Aug 31 11:45:44 cginq01 last message repeated 2 times
Aug 31 12:18:49 cginq01 last message repeated 3 times
Aug 31 12:29:36 cginq01 last message repeated 3 times
Aug 31 12:31:58 cginq01 last message repeated 3 times
Aug 31 12:35:56 cginq01 last message repeated 3 times
Aug 31 12:42:09 cginq01 last message repeated 2 times
Aug 31 12:44:12 cginq01 last message repeated 3 times
Aug 31 12:56:44 cginq01 last message repeated 3 times
Aug 31 12:58:47 cginq01 last message repeated 3 times
Aug 31 13:16:35 cginq01 last message repeated 3 times
Aug 31 13:18:23 cginq01 last message repeated 3 times
Aug 31 13:22:10 cginq01 last message repeated 3 times
Aug 31 13:24:31 cginq01 last message repeated 3 times
Aug 31 13:30:31 cginq01 last message repeated 3 times
Aug 31 15:24:28 cginq01 last message repeated 3 times
Aug 31 15:35:20 cginq01 last message repeated 3 times
Aug 31 16:01:04 cginq01 last message repeated 2 times
Aug 31 17:13:44 cginq01 last message repeated 3 times
Aug 31 17:18:30 cginq01 last message repeated 3 times
Sep 3 10:07:10 cginq01 syslogd 1.4.1: restart.
Sep 3 10:07:10 cginq01 audispd: starting audispd
Sep 3 10:07:10 cginq01 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Sep 3 10:07:10 cginq01 kernel: Linux version 2.6.18-53.el5 (brewbuilder@hs20-bc1-7.build.redhat.com) (gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)) #1 SMP Wed Oct 10 16:34:19 EDT 2007
Sep 3 10:07:10 cginq01 kernel: Command line: ro root=LABEL=/ rhgb quiet
Sep 3 10:07:10 cginq01 kernel: BIOS-provided physical RAM map:
Sep 3 10:07:10 cginq01 kernel: BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
Sep 3 10:07:10 cginq01 kernel: BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
Sep 3 10:07:10 cginq01 kernel: BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Sep 3 10:07:10 cginq01 kernel: BIOS-e820: 0000000000100000 - 00000000bffcb440 (usable)
Sep 3 10:07:10 cginq01 kernel: BIOS-e820: 00000000bffcb440 - 00000000bffceac0 (ACPI data)
Sep 3 10:07:10 cginq01 kernel: BIOS-e820: 00000000bffceac0 - 00000000c0000000 (reserved)
Sep 3 10:07:10 cginq01 kernel: BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
Sep 3 10:07:10 cginq01 kernel: BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
Sep 3 10:07:10 cginq01 kernel: BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
Sep 3 10:07:10 cginq01 kernel: DMI 2.4 present.
Sep 3 10:07:10 cginq01 rpc.statd[2805]: Version 1.0.9 Starting
Sep 3 10:07:10 cginq01 kernel: SRAT: PXM 0 -> APIC 0 -> Node 0
Sep 3 10:07:10 cginq01 kernel: SRAT: PXM 0 -> APIC 1 -> Node 0
Sep 3 10:07:10 cginq01 kernel: SRAT: PXM 0 -> APIC 2 -> Node 0
Sep 3 10:07:10 cginq01 kernel: SRAT: PXM 0 -> APIC 3 -> Node 0
Sep 3 10:07:10 cginq01 kernel: SRAT: Node 0 PXM 0 0-c0000000
Sep 3 10:07:10 cginq01 kernel: SRAT: Node 0 PXM 0 0-140000000
Sep 3 10:07:10 cginq01 kernel: SRAT: Node 0 PXM 0 0-1000000000
Sep 3 10:07:10 cginq01 kernel: SRAT: hot plug zone found 140000000 - 1000000000
Sep 3 10:07:10 cginq01 kernel: SRAT: Hotplug region ignored
Sep 3 10:07:10 cginq01 kernel: Bootmem setup node 0 0000000000000000-0000000140000000
Sep 3 10:07:10 cginq01 kernel: Memory for crash kernel (0x0 to 0x0) notwithin permissible range
Sep 3 10:07:10 cginq01 kernel: disabling kdump
Sep 3 10:07:10 cginq01 rpc.statd[2805]: statd running as root. chown /var/lib/nfs/statd/sm to choose different user
Sep 3 10:07:10 cginq01 kernel: ACPI: PM-Timer IO Port: 0x588
Sep 3 10:07:10 cginq01 kernel: ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Sep 3 10:07:10 cginq01 kernel: Processor #0 6:15 APIC version 20
Sep 3 10:07:10 cginq01 kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Sep 3 10:07:10 cginq01 kernel: Processor #1 6:15 APIC version 20
Sep 3 10:07:10 cginq01 kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
Sep 3 10:07:10 cginq01 kernel: Processor #2 6:15 APIC version 20
Sep 3 10:07:10 cginq01 kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
Sep 3 10:07:10 cginq01 kernel: Processor #3 6:15 APIC version 20
Sep 3 10:07:10 cginq01 kernel: ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])


Best Regards
Chacko
 
Old 09-04-2009, 12:21 AM   #6
JulianTosh
Member
 
Registered: Sep 2007
Location: Las Vegas, NV
Distribution: Fedora / CentOS
Posts: 674
Blog Entries: 3

Rep: Reputation: 90
I'm gonna say your server went down shortly after Aug 31 17:18:30.

Go back further... I want to see what message was being repeated so many times.

Also, what kind of server is it? A web server? If it is, tail the httpd logs and post them too.
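
One way to arrive at that estimate is to look for the longest silence in the log. A rough sketch in Python, assuming classic syslog timestamps (`Mon D HH:MM:SS` with no year, so the year has to be supplied) and a log that does not cross a year boundary. The function names are illustrative, not from any existing tool; the sample lines are copied from the log posted above.

```python
from datetime import datetime


def parse_syslog_time(line, year):
    """Parse the leading 'Mon D HH:MM:SS' stamp of a classic syslog line."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def largest_gap(lines, year):
    """Return (gap, line_before, line_after) for the longest silence, or None."""
    best = None
    prev_time = prev_line = None
    for line in lines:
        try:
            stamp = parse_syslog_time(line, year)
        except ValueError:
            continue  # skip blank, wrapped, or malformed lines
        if prev_time is not None:
            gap = stamp - prev_time
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = stamp, line
    return best


sample = [
    "Aug 31 17:13:44 cginq01 last message repeated 3 times",
    "Aug 31 17:18:30 cginq01 last message repeated 3 times",
    "Sep 3 10:07:10 cginq01 syslogd 1.4.1: restart.",
]
gap, before, after = largest_gap(sample, 2009)
print(gap)  # 2 days, 16:48:40
```

The last line before the gap is only the last thing syslogd managed to write, so the machine may have kept running for a while after it.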

Last edited by JulianTosh; 09-04-2009 at 12:22 AM.
 
Old 09-04-2009, 12:28 AM   #7
eldhochacko
LQ Newbie
 
Registered: Sep 2009
Location: kochi
Posts: 12

Original Poster
Rep: Reputation: 0
Hi Beotch,

Thanks.

This is not a web server; we are running SAP ECC6 on it (Red Hat Linux 5).

But we were working on the SAP application until 3 AM on Sep 1.

Best Regards
Chacko

Last edited by eldhochacko; 09-04-2009 at 12:35 AM.
 
Old 09-04-2009, 12:33 AM   #8
JulianTosh
Member
 
Registered: Sep 2007
Location: Las Vegas, NV
Distribution: Fedora / CentOS
Posts: 674
Blog Entries: 3

Rep: Reputation: 90
Go back further in /var/log/messages and grab the log entries. We need to see what message was repeating, so skip back until you see something other than "last message repeated..."
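
If the log is long, that skipping back can be scripted. A small sketch, assuming each "last message repeated N times" notice refers to the nearest earlier line in the same file that is not itself such a notice. The helper name `repeated_messages` and the sample lines are hypothetical placeholders, not taken from the real log.

```python
import re

# Matches the notice that classic syslogd writes for suppressed duplicates.
REPEAT = re.compile(r"last message repeated (\d+) times?$")


def repeated_messages(lines):
    """Yield (original_line, repeat_count) for each 'last message repeated' notice."""
    last_real = None
    for line in lines:
        line = line.rstrip()
        match = REPEAT.search(line)
        if match:
            # Attribute the notice to the most recent ordinary line seen so far.
            yield last_real, int(match.group(1))
        elif line:
            last_real = line


# Hypothetical placeholder lines, only to show the call.
sample = [
    "Jan 1 00:00:01 examplehost exampled: placeholder message (hypothetical)",
    "Jan 1 00:05:00 examplehost last message repeated 3 times",
]
for source, count in repeated_messages(sample):
    print(count, "x", source)
```

On the real server it could be fed `open("/var/log/messages")`, plus the rotated files if the original message is older than the current one.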
 
Old 09-04-2009, 01:35 AM   #9
JulianTosh
Member
 
Registered: Sep 2007
Location: Las Vegas, NV
Distribution: Fedora / CentOS
Posts: 674
Blog Entries: 3

Rep: Reputation: 90
Ah. OK then... well, the evidence, if any, might be in another application log. Get a list of services/daemons together that are running on that machine (ssh, iptables, etc.) and start going through their logs for the time that the primary service went down. If you get a list of those services, I can help you look up where they typically store their own log files.

Also, I'd still be curious about what those repeated messages were.
 
Old 09-04-2009, 12:33 PM   #10
canyonbreeze
LQ Newbie
 
Registered: Apr 2009
Posts: 18

Rep: Reputation: Disabled
You can install Webmin. It has a function to notify you if Apache, Postfix, MySQL, etc. go down. My setup sends a message to my cell phone via email.
 
Old 09-04-2009, 04:35 PM   #11
JulianTosh
Member
 
Registered: Sep 2007
Location: Las Vegas, NV
Distribution: Fedora / CentOS
Posts: 674
Blog Entries: 3

Rep: Reputation: 90
I love webmin and that's an OK solution for monitoring local services, but it won't help if the server goes down hard. In this case, it needs to be monitored by a separate server and monitoring service that is immune to any volatile states the watched server/service is currently experiencing.

But if that's all you got, it's better than nothing! 8D

Last edited by JulianTosh; 09-04-2009 at 04:36 PM. Reason: clarification.
 
Old 09-23-2009, 11:46 PM   #12
eldhochacko
LQ Newbie
 
Registered: Sep 2009
Location: kochi
Posts: 12

Original Poster
Rep: Reputation: 0
Server Restarting problem

Hi Beotch,

My quality Linux server restarted again yesterday morning. What could be the reason the server keeps restarting again and again?

Last edited by eldhochacko; 09-24-2009 at 12:02 AM.
 
Old 09-24-2009, 01:29 PM   #14
rsciw
Member
 
Registered: Jan 2009
Location: Essex (UK)
Distribution: Home: Debian/Ubuntu, Work: Ubuntu
Posts: 206

Rep: Reputation: 44
Quote:
Originally Posted by Admiral Beotch
I love webmin and that's an OK solution for monitoring local services, but it won't help if the server goes down hard. In this case, it needs to be monitored by a separate server and monitoring service that is immune to any volatile states the watched server/service is currently experiencing.

But if that's all you got, it's better than nothing! 8D
Munin imo is also a nice monitoring tool.
Handy to see when something dies off via the graphs

(that said, I don't know webmin, but'll check it out)
 
Old 09-24-2009, 02:44 PM   #15
kschmitt
Member
 
Registered: Jul 2009
Location: Chicago Suburbs
Distribution: Crux, CentOS, RHEL, Ubuntu
Posts: 96

Rep: Reputation: 23
You need two things: a monitoring system, and a syslog server. One tells you when there's a problem, the other is used to diagnose the problem.

The monitoring system could be really simple, like putting a script in crontab that pings each server and emails you a list of the ones that don't reply.

or

The monitoring system could be insanely complex and feature-rich, like Zenoss or Hyperic or something.
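
For the simple end of that range, the crontab ping script could look roughly like this. It is a sketch, not a finished tool: it assumes the Linux iputils `ping` (where `-c` is the packet count and `-W` the reply timeout in seconds) and a mail server listening on localhost. The function names and every address are placeholders.

```python
import smtplib
import subprocess
from email.message import EmailMessage


def ping(host, timeout_s=2):
    """Return True if a single ICMP echo request gets a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_down_hosts(hosts, is_up=ping):
    """Return the hosts for which the check fails, in the original order."""
    return [host for host in hosts if not is_up(host)]


def build_report(down, sender, recipient):
    """Build an email listing the hosts that did not reply."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"{len(down)} host(s) not answering ping"
    msg.set_content("\n".join(down))
    return msg


def send_report(msg, smtp_host="localhost"):
    """Hand the message to an SMTP server (assumed to accept local mail)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

A cron job would call `find_down_hosts` with the server list and only call `send_report` when the result is not empty, so a quiet inbox means everything answered.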

Now you need a syslog server! Syslog servers are really easy to set up; just google for setting up a syslog server in your favorite distro. What you want is for every server in your environment to log to that one syslog server. This is important so all the logs are in one place _outside_ of the server that's having problems. Then you can review the logs while the dead server is being rebuilt, or kept offline for security reasons.

When that's set up, what will happen is: your server logs what it's doing to your syslog server; something goes wrong, it writes it to syslog; the monitoring system alerts you that there was a problem; you go into your syslog server and read the logs to figure out what happened.
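
The usual way to get there is to configure each machine's syslog daemon to forward to the central server. As a side illustration of the same idea at the application level, a script can also send its own messages straight to a remote syslog server with Python's standard `logging.handlers.SysLogHandler`. This is only a sketch: the helper name `make_remote_logger` is made up, and the server name in the note below is a placeholder.

```python
import logging
import logging.handlers


def make_remote_logger(name, host, port=514):
    """Return a logger that sends its records to a syslog server over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler uses UDP by default; 514 is the traditional syslog port.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger
```

With a central server called, say, `loghost.example`, something like `make_remote_logger("backup-job", "loghost.example").info("nightly backup finished")` would leave a record on the log server even if the sending machine dies right afterwards.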

It's what I do here at work.

I've got a pretty large environment that I take care of (dev), and a more important, but smaller environment (production). Right now I'm monitoring both of them with Zenoss, which is pretty cool, but honestly, for what I _really_ need, a pinging script would do fine. Dev and production each have their own syslog server, and pretty much everything logs to one server or the other. When something goes wrong, Zenoss sends me an email. At that point I hop on syslog and see if it can tell me what happened (most of the time, it can).

Sidenote: All HP Jetdirect cards can do syslog! If you point a Jetdirect card to your syslog server, you have an easy way of knowing when something is going wrong with your printers. For instance, if I see more than a handful of paper-late jams on one printer in a day, I can be pretty sure the fuser is going.

--Kyle
 
  


All times are GMT -5.
