LinuxQuestions.org
Old 07-04-2006, 05:07 AM   #1
gn00kie
Member
 
Registered: Jan 2006
Posts: 70

Rep: Reputation: 15
Server hang up and need to find root cause


Hi Guys,
One of our RHEL 4 servers hung last week. My boss is asking me for a possible root cause of the hang. I checked /var/log/messages but could not find anything there. There is a gap in the log between the last time the server was still OK and the time it was restarted. Where could I get more information to help me find the reason? A snippet of the logs is below to show what I mean. TIA.

Jul 1 09:08:01 somemachine crond(pam_unix)[28797]: session closed for user someuser
Jul 1 09:08:01 testmachine crond(pam_unix)[28800]: session closed for user someuser
Jul 1 09:45:44 somemachine syslogd 1.4.1: restart.
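One way to locate such blackouts across a whole log file is to scan for consecutive entries that are unusually far apart. The following is only a minimal sketch: it assumes the classic syslog timestamp format shown in the snippet above, a Python 3 interpreter (possibly on another machine the log was copied to), and a hypothetical helper name `find_gaps`. The year has to be supplied because classic syslog lines do not record it.

```python
from datetime import datetime, timedelta

def find_gaps(lines, threshold=timedelta(minutes=10), year=2006):
    """Return (previous, next) timestamp pairs further apart than threshold."""
    gaps = []
    prev = None
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # not a syslog-style line
        stamp = "%d %s %s %s" % (year, parts[0], parts[1], parts[2])
        try:
            ts = datetime.strptime(stamp, "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a parsable timestamp
        if prev is not None and ts - prev > threshold:
            gaps.append((prev, ts))
        prev = ts
    return gaps

# The three lines from the snippet above
sample = [
    "Jul 1 09:08:01 somemachine crond(pam_unix)[28797]: session closed for user someuser",
    "Jul 1 09:08:01 testmachine crond(pam_unix)[28800]: session closed for user someuser",
    "Jul 1 09:45:44 somemachine syslogd 1.4.1: restart.",
]

for start, end in find_gaps(sample):
    print("no log entries between %s and %s (%s)" % (start, end, end - start))
```

On a real log the list would come from something like `open("/var/log/messages")`. The threshold is a guess and should be larger than the longest quiet period the box normally has, otherwise idle stretches get reported as gaps.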
 
Old 07-04-2006, 06:34 AM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Where is this box located on the network?
Are there any devices in front of it that log info?
Was /etc/syslog.conf changed or does it contain default values?
Are all logins "last" reports accounted for?
Does this box run any SAR?
What services does this box provide and who has access to them?
Have any daemons logged data in the period?
Have these hangs or log blackouts been happening before or not?
Is SW regularly updated?
Does "rpm -Va --noscripts" look OK?
Do you people keep a log of (admin) change reports for the box?

Any other things out of the ordinary you should mention?
 
Old 07-04-2006, 06:58 AM   #3
gn00kie
Member
 
Registered: Jan 2006
Posts: 70

Original Poster
Rep: Reputation: 15
Where is this box located on the network?
yes, this box is located on a network. it has a public ip and a private ip.

Are there any devices in front of it that log info?
none.

Was /etc/syslog.conf changed or does it contain default values?
it contains the default value.

Are all logins "last" reports accounted for?
yes, all logins are accounted for. there is also an entry where it shows that a user was connected from a certain time to a point where it crashed. e.g.

someuser pts/0 someip Sat Jul 1 09:46 - 10:11 (00:24)
reboot system boot 2.6.9-22.0.1.ELs Sat Jul 1 09:45 (1+21:32)
someuser pts/2 someip Sat Jul 1 06:17 - crash (03:28)

Does this box run any SAR?
no

What services does this box provide and who has access to them?
it only houses db processes; the dba has access to them.

Have any daemons logged data in the period?
none also.

Have these hangs or log blackouts been happening before or not?
it happened 3x already. all have the same scenario where logs are not present.

Is SW regularly updated?
what does SW stand for?

Does "rpm -Va --noscripts" look OK?
how do I know if the result is OK?


Do you people keep a log of (admin) change reports for the box?
no. we do not keep them.

Any other things out of the ordinary you should mention?
none so far.
 
Old 07-04-2006, 07:35 AM   #4
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Where is this box located on the network?
yes, this box is located on a network. it has a public ip and a private ip.

Are any services accessible on the public interface? Is the box firewalled on all interfaces?


Was /etc/syslog.conf changed or does it contain default values?
it contains the default value.

* If nothing else helps, it may prove valuable to have processes log more verbosely and catch that with a "*.*" entry in syslog.conf. The downside is that you will need a lot of free disk space, and you may have to schedule extra logrotate runs for those logs to combat extreme log growth.


Are all logins "last" reports accounted for?
yes, all logins are accounted for. there is also an entry where it shows that a user was connected from a certain time to a point where it crashed. e.g.

someuser pts/0 someip Sat Jul 1 09:46 - 10:11 (00:24)
reboot system boot 2.6.9-22.0.1.ELs Sat Jul 1 09:45 (1+21:32)
someuser pts/2 someip Sat Jul 1 06:17 - crash (03:28)

Is "someuser" human? Is it fine for this user to be there on a Saturday at 06 AM? And is "someip" allowed to access the box? And is "someuser" on pts/0 the same as in the earlier entry? * You can have every command issued by a user logged if you wrap their default shell with "rootsh".


Does this box run any SAR?
no

* Then maybe you should. Ideally you should first save default values right after boot to diff against. Look for "Atsar" (maybe DAG has an EL4 .rpm) or "Dstat" (needs Python, its CSV output is easy to chart in OOo) or maybe remote through any SNMP monitoring SW like say Nagios (OK, you need snmpd for that).
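The "save default values right after boot to diff against" idea can be tried by hand before any of those tools is installed. What follows is a rough sketch only, assuming Python 3 and the usual "Field: value kB" layout of /proc/meminfo; the function names and the numbers in the example are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict (values in kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values

def diff_snapshots(baseline, current):
    """Return {field: change} for fields present in both snapshots that changed."""
    return {
        key: current[key] - baseline[key]
        for key in baseline
        if key in current and current[key] != baseline[key]
    }

# Hypothetical numbers, not taken from the server in this thread
baseline_text = "MemTotal:  4045876 kB\nMemFree:  3200000 kB\nSwapFree:  2096472 kB\n"
later_text = "MemTotal:  4045876 kB\nMemFree:  51200 kB\nSwapFree:  1024 kB\n"

delta = diff_snapshots(parse_meminfo(baseline_text), parse_meminfo(later_text))
for field, change in sorted(delta.items()):
    print("%s: %+d kB" % (field, change))
```

On the box itself the baseline text would be saved once right after boot (for example from `open("/proc/meminfo").read()`) and later samples compared against it. A steadily shrinking MemFree together with shrinking SwapFree would point towards a leak, though this alone does not say which process is responsible.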


What services does this box provide and who has access to them?
it only houses db processes; the dba has access to them.

I imagine the db is being used by applications (on adjacent boxen)? Maybe give more details what db you're running, what it's used by, if it's a recent SW (software) version, any problems encountered in the past with any of it etc, etc.


Have any daemons logged data in the period?
none also.

Can they log verbose?


Have these hangs or log blackouts been happening before or not?
it happened 3x already. all have the same scenario where logs are not present.

Could be anything from deliberate resets to memory leaks to overheating.
At this point there's not enough info to even try to speculate.
The more you log the more chance you have narrowing it down.


Is SW regularly updated?
what does SW stand for?

Software. HW is hardware and "wetware" are "lusers" or human users. Some speak of "meatware" because them admins tend to grind them for lunch but I think that's pushing it too far. Dinner is OK I think ;-p


Does "rpm -Va --noscripts" look OK?
how do I know if the result is OK?

You know because you've opened up "man rpm" and looked for what it reports (S, M, 5, U, G, etc, etc) under "VERIFY OPTIONS"?
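For readers without the man page at hand, here is a minimal sketch of decoding the flag column of `rpm -Va` output. The flag meanings are the ones listed under "VERIFY OPTIONS"; the helper name `explain_verify_line`, the sample line, and the assumption that paths contain no whitespace are mine. Newer rpm releases may print additional flag characters, which this table simply ignores.

```python
RPM_VERIFY_FLAGS = {
    "S": "file size differs",
    "M": "mode differs (permissions or file type)",
    "5": "MD5 digest differs",
    "D": "device major/minor number mismatch",
    "L": "readlink(2) path mismatch",
    "U": "user ownership differs",
    "G": "group ownership differs",
    "T": "mtime differs",
}

def explain_verify_line(line):
    """Return (path, [reasons]) for one line of `rpm -Va` output, or None if blank."""
    parts = line.split()
    if not parts:
        return None
    path = parts[-1]  # assumes the path contains no whitespace
    if parts[0] == "missing":
        return path, ["file is missing"]
    flags = parts[0]
    reasons = [RPM_VERIFY_FLAGS[c] for c in flags if c in RPM_VERIFY_FLAGS]
    if "?" in flags:
        reasons.append("a test could not be performed")
    return path, reasons

sample = "S.5....T c /etc/ssh/sshd_config"  # hypothetical line for illustration
path, reasons = explain_verify_line(sample)
print(path + ": " + ", ".join(reasons))
```

A "c" in the second column marks a configuration file, and edited configuration files showing S, 5 and T are usually expected. Size or digest changes on binaries and libraries are the entries that deserve a closer look.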


Do you people keep a log of (admin) change reports for the box?
no. we do not keep them.

Running servers in a professional environment is all about stability. Anything that "threatens" stability should be investigated, mended and logged. This provides a history of stuff encountered and fixed and is also efficient for sharing information.
 
Old 07-04-2006, 08:57 PM   #5
gn00kie
Member
 
Registered: Jan 2006
Posts: 70

Original Poster
Rep: Reputation: 15
@unSpawn,
it is actually an oracle account. the night shift guys are the ones logged on during that time. they have some scripts that they need to run to check availability of server.
 
Old 07-04-2006, 10:32 PM   #6
Capt_Caveman
Senior Member
 
Registered: Mar 2003
Distribution: Fedora
Posts: 3,658

Rep: Reputation: 69
Moved: This thread is more suitable in the Linux Security forum and has been moved accordingly to help your thread/question get the exposure it deserves.
 
Old 07-07-2006, 01:39 AM   #7
gn00kie
Member
 
Registered: Jan 2006
Posts: 70

Original Poster
Rep: Reputation: 15
I got logs from our Nagios server showing that it detected a CPU and memory resource shortage around the time syslogd stopped logging. I assume a process used up all the resources. How can I find out which process is doing this? TIA.
 
Old 07-09-2006, 05:13 PM   #8
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Check out Atop.

Last edited by unSpawn; 07-09-2006 at 05:17 PM.
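Atop records per-process resource use over time, which is what is needed to trap the offending process. To illustrate the underlying idea only (this is not how atop is implemented), here is a small sketch that ranks processes by resident memory using /proc/<pid>/status. It assumes Python 3 and a Linux /proc filesystem; the function names are hypothetical.

```python
import os

def parse_status(text):
    """Extract (name, rss_kb) from the text of a /proc/<pid>/status file."""
    name, rss_kb = None, 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss_kb = int(line.split()[1])
    return name, rss_kb  # kernel threads have no VmRSS line, so rss_kb stays 0

def top_memory_users(limit=5, proc_root="/proc"):
    """Return up to `limit` (rss_kb, pid, name) tuples, largest resident set first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss_kb = parse_status(fh.read())
        except OSError:
            continue  # process exited between listing and reading
        rows.append((rss_kb, int(entry), name))
    return sorted(rows, reverse=True)[:limit]

if os.path.isdir("/proc"):
    for rss_kb, pid, name in top_memory_users():
        print("%8d kB  pid %-6d %s" % (rss_kb, pid, name))
```

Run from cron every minute with the output appended to a file, something like this would leave a record of the biggest memory users shortly before a hang, provided the last sample reaches the disk before the box stops responding. Atop covers CPU, disk and network as well and is the better choice for anything beyond a quick check.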
 
Old 07-09-2006, 11:03 PM   #9
gn00kie
Member
 
Registered: Jan 2006
Posts: 70

Original Poster
Rep: Reputation: 15
hi unSpawn,
I have already installed atop on our system. I would just like to ask how to configure the /etc/atop/atop.24hours script. Should I put it in the crontab, or should I just modify the "postrotate" parameter in /etc/logrotate.d/pacct? I read the atop man page and got confused by the part about the script files. TIA
 
Old 07-10-2006, 03:15 AM   #10
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
If installed as RPM you get a crontab file /etc/cron.d/atop.
Change it, or remove it and use /etc/crontab instead.
 
Old 07-10-2006, 05:15 AM   #11
gn00kie
Member
 
Registered: Jan 2006
Posts: 70

Original Poster
Rep: Reputation: 15
ok..i already found the entry..thanks for your help...
 
  


All times are GMT -5.
