LinuxQuestions.org
Old 08-26-2009, 05:22 AM   #1
gjb1002
LQ Newbie
 
Registered: Aug 2009
Posts: 8

Rep: Reputation: 0
How to kill remote processes started with SSH?


Hi all,

I've run into what is apparently an age-old SSH problem, which is that killing an ssh client process does not kill the remote process (unlike e.g. rsh). There seem to be lots of patches and a couple of open bugs on this topic that have been there for about 10 years or so...

Having convinced myself by googling that there is no easy solution, I'm now looking for a workaround of some sort. I'm writing a testing framework so the processes I'm running remotely could be anything at all, i.e. I only have control of the client side. Also the remote processes are of course highly unstable and I need to be able to terminate them if they hang. ssh -t won't work for me as I don't necessarily have a terminal.

Finding the remote process ID would be enough so I can do ssh <machine> kill <pid>, but I don't see any way to do that either. Just using ps, pgrep etc seems to suffer from not being able to uniquely identify the correct process, and killing the wrong process is of course very bad.

Any help appreciated.

/Geoff Bache
 
Old 08-26-2009, 07:11 AM   #2
fordeck
Member
 
Registered: Oct 2006
Location: Utah
Posts: 520

Rep: Reputation: 61
Would the use of the 'screen' command fit your requirements? Check out its man page:

man screen


Regards,

Fordeck
 
Old 08-26-2009, 07:45 AM   #3
gjb1002
LQ Newbie
 
Registered: Aug 2009
Posts: 8

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by fordeck
Would the use of the 'screen' command fit your requirements? Check out its man page:

man screen


Regards,

Fordeck
Forgive me if I'm being dense, but "screen" appears to be a window manager, whereas I'm trying to execute and terminate shell commands remotely. I don't really get what they have to do with each other.

Could you elaborate?

/Geoff
 
Old 08-26-2009, 07:45 AM   #4
unSpawn
Moderator
 
Registered: May 2001
Posts: 27,457
Blog Entries: 54

Rep: Reputation: 2897
...and if that doesn't work for you then
Quote:
Originally Posted by gjb1002
Finding the remote process ID would be enough so I can do ssh <machine> kill <pid>, but I don't see any way to do that either. Just using ps, pgrep etc seems to suffer from not being able to uniquely identify the correct process, and killing the wrong process is of course very bad.
maybe 'pgrep' can do more for you than you realize: it offers pattern matching, PPID, SID and other selectors. Posting a few examples could enable us to help you a wee bit better.
 
Old 08-26-2009, 08:27 AM   #5
gjb1002
LQ Newbie
 
Registered: Aug 2009
Posts: 8

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by unSpawn
...and if that doesn't work for you then

maybe 'pgrep' can do more for you than you realize: it offers pattern matching, PPID, SID and other selectors. Posting a few examples could enable us to help you a wee bit better.
OK. Suppose that a program called "system_under_test.sh" is being tested.
My test tool submits this via
ssh <machine> system_under_test.sh and off it goes. Sometime later it has hung and the user tries to kill this run. The test tool now needs to find its process ID on the remote machine.

I can do pgrep system_under_test.sh but there's no guarantee that there aren't other instances of that program running on that machine. I can check that the parent process is "sshd" but that doesn't guarantee uniqueness either.
I've looked through "man pgrep" but can't see anything that could tie it to the ssh process that is still running locally.

Regards,
Geoff
 
Old 08-26-2009, 09:10 AM   #6
unSpawn
Moderator
 
Registered: May 2001
Posts: 27,457
Blog Entries: 54

Rep: Reputation: 2897
When you run 'ssh <machine> system_under_test.sh', sshd on the remote forks a privilege-separation process and, on login, starts an instance of your shell, which gets its own PID. When 'system_under_test.sh' executes, it has that shell's PID as its PPID (as in `pgrep -lP $PPID`) for its lifespan. Combining a search of the process list for an arbitrary string in argv[0] with a match on the PPID, and then (p)killing the result, looks feasible to me.
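
The PPID selector mentioned here can be tried on its own. This is only a minimal sketch, assuming the procps `pgrep` is installed; `children_of` is a made-up helper name, not something from the thread:

```shell
#!/bin/sh
# children_of PARENT_PID
# Lists "PID name" for every direct child of PARENT_PID.
# -P selects by parent PID; -l adds the process name to the output.
# (Hypothetical helper name; pgrep exits non-zero when nothing matches.)
children_of() {
    pgrep -l -P "$1"
}

# Example: show what the current shell has started.
#   sleep 300 &
#   children_of "$$"
```

Note that this only helps once the parent PID is known; it does not by itself say which shell on the remote machine belongs to which ssh client.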
 
Old 08-26-2009, 09:36 AM   #7
gjb1002
LQ Newbie
 
Registered: Aug 2009
Posts: 8

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by unSpawn
When you run 'ssh <machine> system_under_test.sh', sshd on the remote forks a privilege-separation process and, on login, starts an instance of your shell, which gets its own PID. When 'system_under_test.sh' executes, it has that shell's PID as its PPID (as in `pgrep -lP $PPID`) for its lifespan. Combining a search of the process list for an arbitrary string in argv[0] with a match on the PPID, and then (p)killing the result, looks feasible to me.
When I do "ssh <machine> sleep 1000" I get the following output from ps :

Quote:
4 S root 10985 20994 1 75 0 - 21991 stext 16:31 ? 00:00:00 sshd: geoff [priv]
5 S geoff 10994 10985 0 76 0 - 22019 stext 16:31 ? 00:00:00 sshd: geoff@notty
0 S geoff 11019 10994 3 78 0 - 18208 rt_sig 16:31 ? 00:00:00 tcsh -c sleep 1000
0 S geoff 11364 11019 0 78 0 - 14730 - 16:31 ? 00:00:00 sleep 1000
In this case I would be trying to kill process "11364". This has PPID "11019".

I don't see how I could know that programmatically though. There is nothing unique about that process, is there? If I knew this number I could of course find the right process, but how would I get it in the first place?
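
To make the parent/child chain in that listing concrete, here is a rough sketch that walks it mechanically. It assumes the column layout shown above (PID in field 4, PPID in field 5); `descendants_of` is a hypothetical helper name. It does not solve the problem of learning the starting PID, it only shows that once any PID in the chain is known, the rest follows:

```shell
#!/bin/sh
# descendants_of ROOT_PID
# Reads a process listing on stdin (PID in field 4, PPID in field 5, as in
# the listing above) and prints every PID that has ROOT_PID as an ancestor.
descendants_of() {
    awk -v root="$1" '
        { parent[$4] = $5 }              # remember each process and its parent
        END {
            for (pid in parent) {
                p = pid; hops = 0
                # Walk up the parent chain; the hop limit guards against loops.
                while ((p in parent) && hops++ < 1000) {
                    p = parent[p]
                    if (p == root) { print pid; break }
                }
            }
        }' | sort -n
}
```

Fed the four lines above with 10994 (the "sshd: geoff@notty" process) as the root, it prints 11019 and 11364, i.e. the tcsh and the sleep.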

/Geoff
 
Old 08-26-2009, 09:55 AM   #8
unSpawn
Moderator
 
Registered: May 2001
Posts: 27,457
Blog Entries: 54

Rep: Reputation: 2897
Essentially you would want to log the PPID when "sleep 1000" starts so you don't have to waste time looking it up. Failing that, 'pgrep -f "sleep 1000"' should yield the process's PID and 'ps --no-headers -p $PID -o ppid' that PID's PPID; pkill with "-P $PPID" would then only kill processes under that PPID.
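
Those steps could be sketched roughly as follows. This assumes procps `pgrep`/`pkill` and a `ps` that accepts `-o ppid=`; the function names are invented for illustration and the functions are meant to run on the machine where the processes live:

```shell
#!/bin/sh
# parent_of PID
# Prints the parent PID of PID. "ppid=" asks ps for that column with an
# empty header, so only the number is printed. (Hypothetical helper name.)
parent_of() {
    ps -o ppid= -p "$1" | tr -d ' '
}

# kill_matching_under PATTERN PARENT_PID
# Sends SIGTERM only to processes whose full command line matches PATTERN
# (-f) AND whose parent is PARENT_PID (-P); pkill ANDs its selectors.
kill_matching_under() {
    pkill -TERM -P "$2" -f "$1"
}

# Example:
#   pid=$(pgrep -f 'sleep 1000' | head -n 1)
#   kill_matching_under 'sleep 1000' "$(parent_of "$pid")"
```

The PPID narrows the match but the first lookup is still by command line, so two identical runs remain indistinguishable unless the parent PID was recorded when the run started.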
 
Old 08-26-2009, 10:20 AM   #9
viajante
LQ Newbie
 
Registered: Dec 2004
Posts: 7

Rep: Reputation: 0
Quote:
Originally Posted by gjb1002
Forgive me if I'm being dense, but "screen" appears to be a window manager, whereas I'm trying to execute and terminate shell commands remotely. I don't really get what they have to do with each other.

Could you elaborate?
fordeck hasn't replied yet, so let me echo his advice: man screen

IMO, all the other options presented to you are simply attacking a very simple problem with a bigger hammer, and a hammer is not the tool you want. The correct tool for you is probably "screen".

"screen" is a whole lot more than a screen manager. I recommend you read the "autodetach on|off" portion of the man page.

Wish I were more of an expert regarding screen, so I could give you specific commands and examples, but I'm not. You'll have to do some reading up on it to fully appreciate its capabilities, or find a bigger hammer :-).

-Al
 
Old 08-26-2009, 11:26 AM   #10
fordeck
Member
 
Registered: Oct 2006
Location: Utah
Posts: 520

Rep: Reputation: 61
Screen creates virtual terminals that you can control and interact with via one terminal. What's even better, a screen session can be disconnected without killing a running task. Imagine starting a long compile on a remote server and the connection dies; when the connection dies, so does your task. Screen works around this by allowing you to detach from a running session, log out, and resume it later, even from a different location.

To get started, make sure the screen package is installed using your distribution's package manager and then type:

$ screen

This will start screen and open a new session. To disconnect from a session, type CTRL-A then d. You will return to the prompt from which you issued screen, but everything you have done in screen is still available. If only one screen session is running you can reconnect to it by logging in again if necessary and then using:

$ screen -R

If there are multiple screen sessions running, this won't work; but you can view a list of running screen sessions by using:

$ screen -list

There are screens on:
        13995.pts-0.host        (Detached)
        14529.pts-0.host        (Attached)
2 Sockets in /home/joe/tmp.

Here, you can see there are two sessions running. To connect to the detached session from a different location, you would use (after SSH-ing to that machine, of course):

$ screen -r 13995

where 13995 is the process ID of the screen session you wish to attach to.

There is a lot of help available for screen, and a lot of things you can do with it. You can view the screen manpage (http://www.slac.stanford.edu/comp/un.../screen.1.html), the output of screen --help, and within a screen session, type CTRL-A then ? to get a list of commands you can use when in command mode (invoked by CTRL-A).

Regards,

Fordeck
 
Old 08-26-2009, 11:32 AM   #11
gjb1002
LQ Newbie
 
Registered: Aug 2009
Posts: 8

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by unSpawn
Essentially you would want to log the PPID when "sleep 1000" starts so you don't have to waste time looking it up. Failing that, 'pgrep -f "sleep 1000"' should yield the process's PID and 'ps --no-headers -p $PID -o ppid' that PID's PPID; pkill with "-P $PPID" would then only kill processes under that PPID.
OK, but that's exactly the problem. I don't have control of the script I'm running : the test framework could be testing any program at all. So adding remote-side logging isn't trivial.

In any case, surely "pgrep -f <command line>" could end up matching other instances of that command line running on that machine. There isn't any way to guarantee I'm not going to kill some other test run, possibly even belonging to a different user (on the remote test machine everyone has the same user ID).
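
One client-side workaround that needs no change to the program under test is to have the remote shell print its own PID and then exec the program, so that the printed PID becomes the program's PID. What follows is only a sketch under assumptions: it presumes a POSIX sh on the remote machine, `run_reporting_pid` is a made-up name, and it is shown running locally so it can be tried without a remote host:

```shell
#!/bin/sh
# run_reporting_pid CMD [ARGS...]
# Prints the PID of a fresh sh on the first line of stdout, then replaces
# that sh with CMD via exec. exec does not fork, so the printed PID is the
# PID that CMD itself runs under. (Hypothetical helper name.)
run_reporting_pid() {
    sh -c 'echo "$$"; exec "$@"' sh "$@"
}

# Assumed remote form of the same idea; the test tool would read the first
# line of output as the PID and later run: ssh <machine> kill <pid>
#   ssh <machine> "sh -c 'echo \$\$; exec system_under_test.sh'"
```

The cost is that the first line of the captured output is the PID and has to be stripped by the test tool, and that any children the program forks are not covered by killing that one PID.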
 
Old 08-26-2009, 12:00 PM   #12
unSpawn
Moderator
 
Registered: May 2001
Posts: 27,457
Blog Entries: 54

Rep: Reputation: 2897
Quote:
Originally Posted by gjb1002
(on the remote test machine everyone has the same user ID).
Next time please try to elaborate in your OP.
 
Old 08-26-2009, 01:06 PM   #13
gjb1002
LQ Newbie
 
Registered: Aug 2009
Posts: 8

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by unSpawn
Next time please try to elaborate in your OP.
Well, of course I tried to do that. But this fact alone isn't a killer factor, it's just an extra complication. The same user could easily be running multiple tests of the same program (I do that all the time) and there is still the issue of not having any control of the script being run.

Reading "man screen" and still trying to understand how it relates to my problem...
 
Old 08-26-2009, 01:29 PM   #14
gjb1002
LQ Newbie
 
Registered: Aug 2009
Posts: 8

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by fordeck
Screen creates virtual terminals that you can control and interact with via one terminal. What's even better, a screen session can be disconnected without killing a running task. [...]
Right, thanks for the info.

Did you envisage I would run "screen" locally or remotely (or even instead of SSH somehow)?

I tried

ssh <machine> screen sleep 100

but it said

Must be connected to a terminal.

I tried

screen ssh <machine> sleep 100

But terminating that didn't even terminate the ssh process, never mind the remote sleep process, even with "autodetach off" in my .screenrc
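
For what it's worth, screen can be started already detached with -d -m, which avoids the "Must be connected to a terminal" error, and a session named with -S can later be told to quit with -X. A rough sketch, assuming screen is installed on the remote machine; the helper names and the session name are invented for illustration:

```shell
#!/bin/sh
# start_in_screen HOST SESSION CMD [ARGS...]
# Starts CMD on HOST inside a detached screen session named SESSION.
# -d -m creates the session without attaching; -S names it.
# (Hypothetical helper; SESSION should be unique per test run.)
start_in_screen() {
    host=$1; session=$2; shift 2
    ssh "$host" screen -dmS "$session" "$@"
}

# stop_screen HOST SESSION
# Sends the "quit" command to the named session, which kills all of its
# windows and terminates that screen.
stop_screen() {
    ssh "$1" screen -S "$2" -X quit
}

# Example:
#   start_in_screen <machine> testrun-1234 sleep 100
#   stop_screen <machine> testrun-1234
```

The trade-off is that the ssh call returns at once, so the program's output and exit status no longer come back over the ssh connection, and a program that ignores the hangup it receives when its window is killed could still survive.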

Regards,
Geoff
 
Old 08-26-2009, 03:06 PM   #15
fordeck
Member
 
Registered: Oct 2006
Location: Utah
Posts: 520

Rep: Reputation: 61
This may or may not be what you are trying to accomplish. What I was thinking from your original post was that you would log in to the remote machine via ssh and then start a screen session. You could then start your test program. This way you could detach from it and log out of your ssh session. You then have the option to ssh from any location and attach to the detached session, which is still running your test program. You can then check on your program and kill it if necessary.

Again after looking at your previous posts, I think I probably misunderstood what you were trying to accomplish and perhaps screen is not what you are looking for.

Regards,

Fordeck
 
  


All times are GMT -5. The time now is 10:11 AM.