LinuxQuestions.org

Jozzie 05-04-2011 05:21 AM

Remote Server Maintenance
 
Hey Linux Community,
I have a question, my first and likely not my last. We are monitoring several Linux servers; my main concern is two servers in particular, on *.*.*.49 and *.*.*.50, with three of the same instances on each. We have been arguing with our developers, as their bugs, alongside users over-searching, keep causing the instances to crash. Recovery is a simple process: log in over SSH (e.g. PuTTY), enter a username/password, run "su -" and enter the root password for the respective host, then run "ps -ef | grep java" to get the process list up, find the PID of the offline instance, kill it, and wait for it to restart.

Now, even though this does take a little time, it is an easy process, however we have some people working on the helpdesk who are wanting to do server monitoring over weekends to earn some extra cash, and they are all but computer illiterate. I was hoping to make the process as simple as possible for them so they could see which instance was failing and simply run a script which would remotely kill the process. We already have monitoring in place, so it's just the script that is the issue.

I was wondering if any genius knows of a script that would be able to pull the PID from the instance name and kill said instance, without the need to use PuTTY (or an alternative)?

I'm not entirely sure this is possible, but I would be eternally grateful if anyone has any ideas that could aid me.

Thanks!

TobiSGD 05-04-2011 06:03 AM

If you already know the name of the process and this is the only process with this name you can use pgrep to get the PID.

Jozzie 05-04-2011 06:41 AM

Hi Tobi, thanks for your swift reply. I am already aware you can pull up the PID; my main concern is how I can run a kill command, ideally on a process name (as the PID can change), so that I can execute a script remotely, e.g. via an HTML interface.

For example, one of the instances running is called mnet0104. When viewing the instances, we can tell when this process has crashed. All I want is to remotely execute, as simply as possible, a script that runs a kill command on that process. I would obviously need some way of logging into the server (possibly using something similar to a SendKeys command in VBScript). Ideally this could all be done by executing one file remotely (possibly with it prompting for a login).

TobiSGD 05-04-2011 07:09 AM

If you want to run kill on a name instead of a PID you should have a look at the killall command. Also, if you want to start your script from a web-interface you should have a look at this thread.
If you want to do it with SSH, you can start a script on your server with the ssh command like this:
Code:

ssh user@server myscript
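
As a rough sketch of how that one-liner could be extended to pass the instance name along: the script below only prints the ssh command unless RUN=1 is set. The names kill_remote, REMOTE_USER, REMOTE_HOST, REMOTE_SCRIPT and the RUN switch are made up for illustration; user, server and myscript are the placeholders from the command above.

```shell
#!/bin/sh
# Hypothetical wrapper around "ssh user@server myscript".
# All values below are placeholders, not real hosts or scripts.
REMOTE_USER=user
REMOTE_HOST=server
REMOTE_SCRIPT=myscript

# Print the ssh command for the given instance; only execute it when RUN=1.
kill_remote() {
    instance=$1
    if [ "${RUN:-0}" = 1 ]; then
        ssh "$REMOTE_USER@$REMOTE_HOST" "$REMOTE_SCRIPT" "$instance"
    else
        echo "ssh $REMOTE_USER@$REMOTE_HOST $REMOTE_SCRIPT $instance"
    fi
}

kill_remote MNet0102
```

ssh joins its trailing arguments with spaces and hands them to the remote shell, so the instance name is worth validating before it gets that far.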

Jozzie 05-04-2011 08:17 AM

Hi Tobi, thanks again for a quick reply.

I'll take a look at that web-interface link - though it seems unresolved?

I'll look into the killall command, not sure how to use it to just kill off one of the instances and not all of the java processes

Jozzie 05-06-2011 03:13 AM

ok, I have looked into it all, I can create a web interface with php, which is no problem, however it will be a problem if I can't automate the kill process without having to discover the PID each time.

Even though the killall command looks like a good idea (as the instance name wouldn't change), for some unknown reason it doesn't work.

I read: http://www.linfo.org/killall.html

which made it all sound so easy, but when I tried the killall command on our server it read:
MNet0104: no process killed

It doesn't say no process found, it just doesn't kill it. Has anyone got any suggestions on how I can proceed?

Jozzie 05-06-2011 06:01 AM

there are several processes, each of them a different instance, the following is just one of the ones I've pulled out of the many.
after entering:
ps -ef | grep java

/opt/IBM/WebSphere/AppServer/java/bin/java -Xbootclasspath/p:/opt/IBM/WebSphere/AppServer/java/jre/lib/ext/ibmorb.jar:/opt/IBM/WebSphere/AppServer/java/jre/lib/ext/ibmext.jar -Dwas.status.socket=48779 -classpath /opt/IBM/WebSphere/AppServer/profiles/MNetNode101/properties:/opt/IBM/WebSphere/AppServer/properties:/opt/IBM/WebSphere/AppServer/lib/bootstrap.jar:/opt/IBM/WebSphere/AppServer/lib/j2ee.jar:/opt/IBM/WebSphere/AppServer/lib/lmproxy.jar:/opt/IBM/WebSphere/AppServer/lib/urlprotocols.jar -verbose:gc -Xms768m -Xmx1536m -Dws.ext.dirs=/opt/IBM/WebSphere/AppServer/java/lib:/opt/IBM/WebSphere/AppServer/profiles/MNetNode101/classes:/opt/IBM/WebSphere/AppServer/classes:/opt/IBM/WebSphere/AppServer/lib:/opt/IBM/WebSphere/AppServer/installedChannels:/opt/IBM/WebSphere/AppServer/lib/ext:/opt/IBM/WebSphere/AppServer/web/help:/opt/IBM/WebSphere/AppServer/deploytool/itp/plugins/com.ibm.etools.ejbdeploy/runtime -Dderby.system.home=/opt/IBM/WebSphere/AppServer/derby -Dcom.ibm.itp.location=/opt/IBM/WebSphere/AppServer/bin -Djava.util.logging.configureByServer=true -Dibm.websphere.preload.classes=true -Duser.install.root=/opt/IBM/WebSphere/AppServer/profiles/MNetNode101 -Dwas.install.root=/opt/IBM/WebSphere/AppServer -Djava.util.logging.manager=com.ibm.ws.bootstrap.WsLogManager -Ddb2j.system.home=/opt/IBM/WebSphere/AppServer/cloudscape -Dserver.root=/opt/IBM/WebSphere/AppServer/profiles/MNetNode101 -DloggerDebug=0 -DserverRole=1 -DserverName=MNet0102 -Xloratio0.1 -Djava.security.auth.login.config=/opt/IBM/WebSphere/AppServer/profiles/MNetNode101/properties/wsjaas.conf -Djava.security.policy=/opt/IBM/WebSphere/AppServer/profiles/MNetNode101/properties/server.policy com.ibm.ws.bootstrap.WSLauncher com.ibm.ws.runtime.WsServer /opt/IBM/WebSphere/AppServer/profiles/MNetNode101/config MNetCell MNET101 MNet0102

I've tried "killall MNet0102"

and it comes back as "no process killed"

any suggestions?

SL00b 05-06-2011 08:29 AM

The problem is that the killall command kills processes by program name, and your program name is not MNet0102. The program name is java. Using killall in this situation would result in all three of your WAS instances being killed.
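
A small illustration of that difference, using a shortened copy of the command line posted above (the real one is much longer; the shortening is only for readability). The basename of the first word is approximately what killall compares against, while the instance name only shows up among the arguments.

```shell
#!/bin/sh
# Abbreviated version of the WebSphere command line from the ps output,
# cut down for illustration only.
cmdline='/opt/IBM/WebSphere/AppServer/java/bin/java -DserverName=MNet0102 com.ibm.ws.runtime.WsServer MNet0102'

# Program name: take the first word, then strip the directory part.
prog=${cmdline%% *}
prog=${prog##*/}
echo "program name: $prog"

# The instance name appears only in the arguments, not in the program name.
case "$cmdline" in
    *MNet0102*) echo "MNet0102 found in the full command line" ;;
    *)          echo "MNet0102 not found" ;;
esac
```

So killall java would match this process (and every other java process), while killall MNet0102 matches nothing.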

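This is where pgrep is your friend. It's the functional equivalent of ps -ef | grep <string>, except that all it returns is the PID of the matching process. The -f option makes it match against the full command line rather than just the program name. This should find what you're looking for: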

Code:

pgrep -f MNet0102

And once you've verified pgrep works, you can use its cousin pkill to actually perform the kill.
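
A minimal sketch of what a server-side script built on pgrep and pkill might look like. The function names and the MNet-plus-four-digits name check are assumptions based on the instance names in this thread, not anything WebSphere provides. The pattern is anchored on the serverName= option seen in the posted ps output, so that the script's own command line (which contains only the bare instance name) should not match it.

```shell
#!/bin/sh
# Hypothetical helper; intended to be called as: myscript MNet0102

# Accept only names shaped like the instances mentioned in this thread (assumption).
is_valid_instance() {
    case "$1" in
        MNet[0-9][0-9][0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Build an extended regex that matches the -DserverName=<name> option exactly.
instance_pattern() {
    printf 'serverName=%s( |$)' "$1"
}

# Check with pgrep first, then kill with pkill (default signal is TERM).
kill_instance() {
    is_valid_instance "$1" || { echo "refusing unexpected name: $1" >&2; return 2; }
    pattern=$(instance_pattern "$1")
    if pgrep -f "$pattern" >/dev/null; then
        pkill -f "$pattern"
    else
        echo "no process matches $1" >&2
        return 1
    fi
}

# Only act when an instance name was passed on the command line.
if [ "$#" -gt 0 ]; then
    kill_instance "$1"
fi
```

Rejecting anything that doesn't look like an instance name matters once the script is reachable from a web page, since pkill -f with a loose pattern can take out far more than intended.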

Can I ask why you want to use kill and not the stopServer.sh command?


All times are GMT -5.