LinuxQuestions.org

newbie01.linux 03-18-2011 03:06 PM

HELP - Resource/Performance Monitoring Script - Red Hat Enterprise Linux Server
 
Hi all,

-------------------------
Linux OS Version/Release:
-------------------------

Red Hat Enterprise Linux Server release 5.5 (Tikanga)
Linux <hostname> 2.6.18-194.8.1.el5 #1 SMP Wed Jun 23 10:52:51 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

I have a server that hosts 30+ Oracle databases. Each database has its own set of scripts that shut down and start the database. Things had been working "smoothly" for the last couple of months until this week.

My log in most cases shows errors like the ones below:

cannot fork [Resource temporarily unavailable]
Connection reset by peer

I am 100% sure that the scripts are not at fault, since they have been working without errors for months. I want to monitor the server to find out which resources or configuration settings I have to tweak to get around this problem, if possible. Am I exceeding my ulimit settings, running out of processes, running out of memory/swap space, etc.?

I wish I could stay up 24x7 to monitor the server, but I can't. Can anyone please advise whether there is a monitoring script available somewhere that I can put in cron to temporarily monitor the server for resource issues, for example memory used/left, swap space used/left, ulimit number of processes used/left, nofiles used/left, etc.?

Any help/advice/suggestions will be much appreciated. Thanks in advance.

mesiol 03-20-2011 03:27 AM

That is not much information to go on. You did not mention the hardware, the number of user sessions, or the number of Oracle sessions. What produced the error message: syslog, or the Oracle database software? And when does this error message occur?

You should check the man pages for
Code:

ulimit
lsof
sar
iostat
vmstat
top
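
As a starting point for the cron-based monitoring asked about in the first post, here is a minimal sketch (not a production script) that prints one line per run with free memory, free swap, the current user's process count, and the related limits. It assumes bash, a Linux /proc filesystem, and the procps version of ps. The function names meminfo_kb, count_procs, and snapshot are made up for this example. Limits seen inside a cron job may differ from those of a login shell, so treat the ulimit values as indicative only.

```shell
#!/bin/bash
# Print one line of resource usage. Intended to be run from cron as the
# account that owns the databases, with the output appended to a log file.

# Print the value in kB of one field from a meminfo-style file.
# $1 = field name (e.g. MemFree), $2 = file (defaults to /proc/meminfo).
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Count the processes owned by user $1. Threads also count toward the
# "max user processes" limit, so real usage may be higher than this number.
count_procs() {
    ps -u "$1" -o pid= | wc -l | tr -d ' '
}

# Print a single timestamped line with the current values and limits.
snapshot() {
    local user fh_alloc fh_unused fh_max
    user=$(id -un)
    # /proc/sys/fs/file-nr holds: allocated handles, unused handles, system maximum
    read -r fh_alloc fh_unused fh_max < /proc/sys/fs/file-nr
    printf '%s mem_free_kB=%s mem_total_kB=%s swap_free_kB=%s swap_total_kB=%s' \
        "$(date '+%Y-%m-%d %H:%M:%S')" \
        "$(meminfo_kb MemFree)" "$(meminfo_kb MemTotal)" \
        "$(meminfo_kb SwapFree)" "$(meminfo_kb SwapTotal)"
    printf ' procs=%s nproc_limit=%s file_handles=%s file_max=%s nofile_limit=%s\n' \
        "$(count_procs "$user")" "$(ulimit -u)" \
        "$fh_alloc" "$fh_max" "$(ulimit -n)"
}

snapshot
```

A crontab entry along the lines of `*/5 * * * * /path/to/snapshot.sh >> /tmp/resource_snapshot.log 2>&1` (both paths are placeholders) would record a line every five minutes, which can then be compared against the times at which the fork errors appear.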

Also take a look at
Code:

/var/log/messages
Also take a look at the Oracle database instance alert.log


These will help you determine the reason for your problem.
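
To check whether the errors quoted in the first post show up in a given log, something like the following sketch could be used. It assumes bash and grep, and it only helps if the messages are actually written to the file being searched, which the thread does not confirm. The function name find_errors is hypothetical, and reading /var/log/messages normally requires root.

```shell
#!/bin/bash
# Print the lines of a log file that contain one of the error strings
# quoted in the first post.
# $1 = log file (defaults to /var/log/messages).
find_errors() {
    grep -E 'cannot fork|Resource temporarily unavailable|Connection reset by peer' \
        "${1:-/var/log/messages}"
}

# Show the 20 most recent matches, if the log is readable by this user.
log="${1:-/var/log/messages}"
if [ -r "$log" ]; then
    find_errors "$log" | tail -n 20
else
    echo "cannot read $log" >&2
fi
```

The timestamps on the matching lines can then be lined up with the database start and stop schedule to see whether the failures cluster around the times when many scripts run at once.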


All times are GMT -5.