LinuxQuestions.org


Zeno McDohl 04-11-2005 02:48 PM

Cannot kill process [not fixed]
 
Here's the process I need to kill (it's caught in a loop):
Code:

USER      PID %CPU %MEM  VSZ  RSS TTY      STAT START  TIME COMMAND
zeno      2717  0.0  0.7  8828 5456 ?        T    10:44  0:00 ../src/inuyasha 1801

So I ran "kill 2717", but the process is still there. Then I tried "kill -SIGKILL 2717", and it's still there. I've tried logging off and back on, and all my other processes get killed, but not this one. How can I kill it? I talked to the admin and found out that even root can't kill it. The only way I know of so far is to wait for a server reboot, but I'm not about to do that.

Hammett 04-11-2005 03:01 PM

Have you tried to use "kill -9 PID" ? (kill -9 2717 in the example)

Tinkster 04-11-2005 03:02 PM

Does it still have a parent process that you could kill?


Cheers,
Tink

Zeno McDohl 04-11-2005 03:03 PM

Yes. Last night I did. Tried again:
Code:

[zeno@boralis iyg]$ ps ux
USER      PID %CPU %MEM  VSZ  RSS TTY      STAT START  TIME COMMAND
zeno      2717  0.0  0.7  8828 5456 ?        T    Apr10  0:00 ../src/inuyasha 1801
zeno    12871  0.0  0.2  8288 2156 ?        S    13:03  0:00 sshd: zeno@pts/9
zeno    12872  0.0  0.1  6528 1272 pts/9    S    13:03  0:00 -bash
zeno    12901  0.0  0.1  3364  860 pts/9    R    13:04  0:00 ps ux
[zeno@boralis iyg]$ kill -9 2717
[zeno@boralis iyg]$ ps ux
USER      PID %CPU %MEM  VSZ  RSS TTY      STAT START  TIME COMMAND
zeno      2717  0.0  0.7  8828 5456 ?        T    Apr10  0:00 ../src/inuyasha 1801
zeno    12871  0.0  0.2  8288 2156 ?        S    13:03  0:00 sshd: zeno@pts/9
zeno    12872  0.0  0.1  6528 1276 pts/9    S    13:03  0:00 -bash
zeno    12902  0.0  0.1  4252  860 pts/9    R    13:04  0:00 ps ux

[EDIT]
No parent either:
Code:

[zeno@boralis iyg]$ pstree -G zeno
inuyasha

sshd───bash───pstree


Tinkster 04-11-2005 03:06 PM

Quote:

Originally posted by Hammett
Have you tried to use "kill -9 PID" ? (kill -9 2717 in the example)

-9 is just the numeric representation of -SIGKILL
which he had done anyway ...



Cheers,
Tink
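
The point that -9 and -SIGKILL name the same signal can be checked from the shell, since the kill builtin translates signal numbers to names. A minimal sketch, assuming bash or another POSIX-style shell; the variable names are only illustrative:

```shell
# Translate signal numbers to names with the shell's kill builtin.
kill_name=$(kill -l 9)     # the signal sent by "kill -9"
term_name=$(kill -l 15)    # the default signal sent by plain "kill"

echo "signal 9 is SIG$kill_name"
echo "signal 15 is SIG$term_name"
```

So "kill -9 2717" and "kill -SIGKILL 2717" send the same signal, while plain "kill 2717" sends SIGTERM.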

Tinkster 04-11-2005 03:14 PM

Quote:

Originally posted by Zeno McDohl
[EDIT]
No parent either:
Code:

[zeno@boralis iyg]$ pstree -G zeno
inuyasha

sshd───bash───pstree


ps -l, please ...

and can you tell us what the program does?
Maybe it's an I/O wait, a file lock ... what does
lsof say in respect to it?


Cheers,
Tink
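
Note that plain "ps -l" only lists processes attached to the current terminal, so a process whose TTY column shows "?" will not appear in it. A sketch of selecting by PID instead, assuming the procps version of ps; the current shell ($$) is used as a stand-in for the stuck PID:

```shell
# Stand-in target: the current shell. Replace with the stuck PID, e.g. 2717.
target=$$

# Long format for one specific PID, regardless of its controlling terminal.
ps -l -p "$target"

# Only the fields of interest: PID, parent PID, state, kernel wait channel.
info=$(ps -o pid=,ppid=,stat=,wchan= -p "$target")
echo "$info"
```

The PPID and WCHAN columns are the ones that answer the questions about a surviving parent and a possible I/O wait.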

Zeno McDohl 04-11-2005 03:17 PM

Code:

[zeno@boralis iyg]$ ps -l
F S  UID  PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S  1003 12998 12997  0  75  0 -  1223 wait  pts/9    00:00:00 bash
0 R  1003 13028 12998  0  76  0 -  743 -      pts/9    00:00:00 ps

It's a MUD (an online text game).
See: http://www.mudconnect.com/mudfaq/mudfaq-p1.html

lsof? I don't think the host has that installed.
[EDIT] Okay, there is a man page for lsof, but the command itself isn't found.

Tinkster 04-11-2005 03:40 PM

Well ... this may sound odd ... but is the process gone now,
by any chance? It should be showing in ps -l if it was still
hanging in there ... as for the lsof ... it's probably just not
in your path. Try with /usr/sbin/lsof or /sbin/lsof ... ;)


Cheers,
Tink

Zeno McDohl 04-11-2005 03:44 PM

The process is still there:
Code:

[zeno@boralis src]$ ps ux
USER      PID %CPU %MEM  VSZ  RSS TTY      STAT START  TIME COMMAND
zeno      2717  0.0  0.7  8828 5456 ?        T    Apr10  0:00 ../src/inuyasha 1801
zeno    12997  0.0  0.2  8644 2164 ?        S    13:18  0:00 sshd: zeno@pts/9
zeno    12998  0.0  0.1  4892 1268 pts/9    S    13:18  0:00 -bash
zeno    13300  0.0  0.1  3812  860 pts/9    R    13:42  0:00 ps ux

lsof:
Code:

[zeno@boralis sbin]$ ./lsof -p 2717
COMMAND  PID USER  FD  TYPE  DEVICE    SIZE    NODE NAME
inuyasha 2717 zeno  cwd    DIR    3,7    4096 13915317 /home/iyg/mud/area
inuyasha 2717 zeno  rtd    DIR    3,3    4096        2 /
inuyasha 2717 zeno  txt    REG    3,7 2994891 13913248 /home/iyg/mud/src/inuyasha (deleted)
inuyasha 2717 zeno  mem    REG    3,5  65548    21684 /usr/lib/libz.so.1.2.1.2
inuyasha 2717 zeno  mem    REG    3,3  108200    32230 /lib/ld-2.3.4.so
inuyasha 2717 zeno  mem    REG    3,3 1521596    32232 /lib/tls/libc-2.3.4.so
inuyasha 2717 zeno  mem    REG    3,3  28600    32322 /lib/libcrypt-2.3.4.so
inuyasha 2717 zeno  mem    REG    3,3  47444    32243 /lib/libnss_files-2.3.4.so
inuyasha 2717 zeno  mem    REG    3,6  244020    48351 /var/db/nscd/hosts
inuyasha 2717 zeno    0u  CHR  136,2                4 /dev/pts/2 (deleted)
inuyasha 2717 zeno    1w  REG    3,7  45575 13913629 /home/iyg/mud/log/1010.log
inuyasha 2717 zeno    2w  REG    3,7  45575 13913629 /home/iyg/mud/log/1010.log
inuyasha 2717 zeno    3r  CHR    1,3            70349 /dev/null
inuyasha 2717 zeno    4r  CHR    1,3            70349 /dev/null
inuyasha 2717 zeno    5u  IPv4 5572770              TCP *:1801 (LISTEN)
inuyasha 2717 zeno    6u  IPv4 5572771              TCP *:1803 (LISTEN)
inuyasha 2717 zeno    7u  sock    0,4          5582077 can't identify protocol
inuyasha 2717 zeno    8u  sock    0,4          5582029 can't identify protocol
inuyasha 2717 zeno    9u  sock    0,4          5582087 can't identify protocol
inuyasha 2717 zeno  10u  IPv4 5582095              TCP boralis.arthmoor.com:1801->cpe-24-194-57-197.nycap.res.rr.com:12792 (CLOSE_WAIT)
inuyasha 2717 zeno  11u  sock    0,4          5581499 can't identify protocol
inuyasha 2717 zeno  12u  sock    0,4          5580670 can't identify protocol
inuyasha 2717 zeno  13u  IPv4 5577663              TCP boralis.arthmoor.com:1801->CPE0050ba85f271-CM000a739b636b.cpe.net.cable.rogers.com:ms-sql-s (CLOSE_WAIT)
inuyasha 2717 zeno  14u  IPv4 5580819              TCP boralis.arthmoor.com:1801->cpe-24-194-57-197.nycap.res.rr.com:12726 (CLOSE_WAIT)
inuyasha 2717 zeno  15u  sock    0,4          5582102 can't identify protocol
inuyasha 2717 zeno  16u  IPv4 5582024              TCP boralis.arthmoor.com:1801->67-40-20-129.tukw.qwest.net:supfiledbg (CLOSE_WAIT)
inuyasha 2717 zeno  17u  sock    0,4          5582028 can't identify protocol
inuyasha 2717 zeno  18u  IPv4 5579899              TCP boralis.arthmoor.com:1801->67-40-20-129.tukw.qwest.net:1036 (CLOSE_WAIT)
inuyasha 2717 zeno  19u  IPv4 5572809              TCP boralis.arthmoor.com:1801->cpe-66-65-229-123.nycap.res.rr.com:4464 (CLOSE_WAIT)
inuyasha 2717 zeno  20u  IPv4 5573766              TCP boralis.arthmoor.com:1801->ool-44c1d02e.dyn.optonline.net:2570 (CLOSE_WAIT)
inuyasha 2717 zeno  21u  sock    0,4          5582022 can't identify protocol
inuyasha 2717 zeno  22u  sock    0,4          5582100 can't identify protocol


Zeno McDohl 04-11-2005 04:08 PM

Here's the process as shown by "ps xl":
Code:

[zeno@boralis iyg]$ ps xl
F  UID  PID  PPID PRI  NI  VSZ  RSS WCHAN  STAT TTY        TIME COMMAND
0  1003  2717    1  15  0  8828 5456 ptrace T    ?          0:00 ../src/inuyasha 1801
5  1003 12997 12995  15  0  8800 2172 select S    ?          0:00 sshd: zeno@pts/9
0  1003 12998 12997  15  0  4892 1284 wait  S    pts/9      0:00 -bash
0  1003 13544 12998  16  0  3676  632 -      R    pts/9      0:00 ps xl
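
A WCHAN of ptrace together with STAT T suggests the process is stopped under a tracer rather than spinning in a loop. A hedged sketch of checking this on Linux through /proc, assuming the kernel exposes State and TracerPid in /proc/PID/status; the function name is hypothetical and the current shell stands in for the stuck PID:

```shell
# Hypothetical helper: print the state and tracer of a PID from /proc.
show_trace_state() {
    # State is R, S, D, T, Z and so on; TracerPid 0 means nothing is tracing it.
    grep -E '^(State|TracerPid):' "/proc/$1/status"
}

# Stand-in target: the current shell. Replace $$ with the stuck PID.
show_trace_state $$
```

A non-zero TracerPid would point at whatever process is (or was) attached to it.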


Hammett 04-11-2005 04:24 PM

Quote:

Originally posted by Tinkster
-9 is just the numeric representation of -SIGKILL
which he had done anyway ...



Cheers,
Tink

Ouch... sorry, I didn't know.

Zeno McDohl 04-12-2005 01:58 PM

Well, here's some background on the parent process that started it. It was a script named "startup". As soon as I saw the process "lock up", I tried killing it. That didn't work, so I killed the startup script next, and that died without any trouble. Yet this process still hangs.

Tinkster 04-12-2005 03:17 PM

In that case you'll have to wait for the reboot of
the box - I don't know of any other ways of getting
the misbehaving tool back into line.



Cheers,
Tink

Zeno McDohl 04-14-2005 02:31 PM

Well, I fixed it with some help.
Code:

[zeno@boralis iyg]$ strace -p 31177
 Process 31177 attached - interrupt to quit

(Froze, nothing being shown, so I quit out of strace and did it again)
Code:

[zeno@boralis iyg]$ strace -p 31177
 Process 31177 attached - interrupt to quit
 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
 --- SIGINT (Interrupt) @ 0 (0) ---

And it just died. Does anyone know why? I'm happy though. ;)
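
The thread never establishes why attaching and detaching strace let the process die. One general mechanism that may be related, offered only as a sketch and not as a diagnosis: a stopped process does not act on most signals until it is resumed, and detaching a tracer resumes the process it was tracing. The example below uses a throwaway sleep process, not the MUD, and assumes a Linux system with a POSIX shell:

```shell
# Start a disposable background process and stop it.
sleep 300 &
pid=$!
kill -STOP "$pid"
sleep 1

# SIGTERM is queued but not acted on while the process is stopped.
kill -TERM "$pid"
sleep 1
if kill -0 "$pid" 2>/dev/null; then
    still_alive=yes
else
    still_alive=no
fi
echo "after SIGTERM while stopped: alive=$still_alive"

# Resuming the process lets the pending SIGTERM take effect.
kill -CONT "$pid"
wait "$pid"
status=$?
echo "exit status after SIGCONT: $status"
```

SIGKILL normally terminates even a stopped process, so this does not fully account for what the thread shows; it only illustrates how a queued signal can take effect once a stopped process runs again.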


All times are GMT -5.