LinuxQuestions.org


prabhuP 06-28-2016 07:23 AM

pgrep command could not be killed.
 
Hi All,
I have a question about a pgrep command that is run from a script and never returns. The script is scheduled to run every minute, so many instances of pgrep have piled up. I am not able to kill them, even with kill -9. The system also hangs when running any command. I do not want to reboot because it is a production server. Please help me resolve this issue.

Here is the /proc output for one of the stuck processes:

cat /proc/65535/status
Name: pgrep
State: D (disk sleep)
Tgid: 65535
Ngid: 0
Pid: 65535
PPid: 65532
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 64
Groups: 0
VmPeak: 8664 kB
VmSize: 8664 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 1164 kB
VmRSS: 832 kB
VmData: 784 kB
VmStk: 136 kB
VmExe: 20 kB
VmLib: 1976 kB
VmPTE: 40 kB
VmSwap: 380 kB
Threads: 1
SigQ: 239/256918
SigPnd: 0000000000000100
ShdPnd: 0000000000000100
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000001fffffffff
CapEff: 0000001fffffffff
CapBnd: 0000001fffffffff
Seccomp: 0
Cpus_allowed: ffffffff,ffffffff
Cpus_allowed_list: 0-63
Mems_allowed: 00000000,00000003
Mems_allowed_list: 0-1
voluntary_ctxt_switches: 1
nonvoluntary_ctxt_switches: 9

cat /proc/65535/sched
pgrep (65535, #threads: 1)
-------------------------------------------------------------------
se.exec_start : 472423396.990711
se.vruntime : 99.702842
se.sum_exec_runtime : 88.102031
se.statistics.wait_start : 0.000000
se.statistics.sleep_start : 0.000000
se.statistics.block_start : 472423396.990711
se.statistics.sleep_max : 0.000000
se.statistics.block_max : 0.000000
se.statistics.exec_max : 4.001890
se.statistics.slice_max : 0.000000
se.statistics.wait_max : 0.343912
se.statistics.wait_sum : 0.916507
se.statistics.wait_count : 11
se.statistics.iowait_sum : 0.000000
se.statistics.iowait_count : 0
se.nr_migrations : 2
se.statistics.nr_migrations_cold : 0
se.statistics.nr_failed_migrations_affine : 0
se.statistics.nr_failed_migrations_running : 10
se.statistics.nr_failed_migrations_hot : 1
se.statistics.nr_forced_migrations : 0
se.statistics.nr_wakeups : 0
se.statistics.nr_wakeups_sync : 0
se.statistics.nr_wakeups_migrate : 0
se.statistics.nr_wakeups_local : 0
se.statistics.nr_wakeups_remote : 0
se.statistics.nr_wakeups_affine : 0
se.statistics.nr_wakeups_affine_attempts : 0
se.statistics.nr_wakeups_passive : 0
se.statistics.nr_wakeups_idle : 0
avg_atom : 8.810203
avg_per_cpu : 44.051015
nr_switches : 10
nr_voluntary_switches : 1
nr_involuntary_switches : 9
se.load.weight : 1024
se.avg.runnable_avg_sum : 42603
se.avg.runnable_avg_period : 42603
se.avg.load_avg_contrib : 1023
se.avg.decay_count : 450538061
policy : 0
prio : 120
clock-delta : 74
mm->numa_scan_seq : 0
numa_migrations, 0
numa_faults, 0, 0, 1, 0, -1
numa_faults, 1, 0, 0, 0, -1
numa_faults, 0, 1, 0, 0, -1
numa_faults, 1, 1, 0, 0, -1

cat /proc/65535/schedstat
88102031 916507 10

root@nskolkata:/home/user2# cat /proc/65535/stack
[<ffffffff8136ab34>] call_rwsem_down_read_failed+0x14/0x30
[<ffffffff81179782>] __access_remote_vm+0x42/0x1d0
[<ffffffff8117a560>] access_process_vm+0x50/0x70
[<ffffffff81220a7a>] proc_pid_cmdline+0x8a/0x120
[<ffffffff81221f2f>] proc_info_read+0x9f/0xf0
[<ffffffff811b93b5>] vfs_read+0x95/0x160
[<ffffffff811b9ec9>] SyS_read+0x49/0xa0
[<ffffffff8172663f>] tracesys+0xe1/0xe6
[<ffffffffffffffff>] 0xffffffffffffffff


Thanks,
Prabhu

MadeInGermany 06-28-2016 02:01 PM

What does this yield?
Code:

ls -l /proc/65535/fd/

syg00 06-28-2016 07:05 PM

Quote:

Originally Posted by prabhuP (Post 5567319)
State: D (disk sleep)

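Not just "disk sleep", but uninterruptible sleep. Which means exactly that: the process cannot act on any signal, including SIGKILL, until the kernel operation it is waiting on completes.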
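
The SigPnd and ShdPnd lines in the status output from the first post (both 0000000000000100) are hexadecimal bitmasks in which bit N-1 stands for signal number N. As a minimal sketch, assuming a POSIX-style shell with 64-bit arithmetic, the mask can be decoded like this (decode_sigmask is a hypothetical helper name, not an existing tool):

```shell
# decode_sigmask: print the signal numbers set in a hex mask copied from
# a SigPnd/ShdPnd/SigBlk line of /proc/<pid>/status.
# (Hypothetical helper, written for illustration.)
decode_sigmask() {
    mask=$((0x$1))          # interpret the argument as a hex number
    sig=1
    out=""
    while [ "$sig" -le 64 ]; do
        # bit (sig - 1) set means signal number sig is in the mask
        if [ $(( (mask >> (sig - 1)) & 1 )) -eq 1 ]; then
            out="${out:+$out }$sig"
        fi
        sig=$((sig + 1))
    done
    echo "$out"
}

decode_sigmask 0000000000000100   # prints: 9
```

Signal 9 is SIGKILL on Linux, so the kill -9 has been queued for the process but cannot be acted on while it stays in D state.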

Jjanel 06-29-2016 12:57 AM

What is the exact/full pgrep command (the one that gets stuck in D state)?
Maybe an strace will help. Is anything relevant hiding in dmesg?
uname -rvm ; ps -axlww | grep -w D

Possibly relevant: Google results for: hang "call_rwsem_down_read_failed"
(Excuse my blind newbie wild-guessing here... I am not sure I got the call order right.)

IF your info enables someone to reproduce it exactly, THEN a solution is at hand!
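
The D-state check suggested above with ps can also be done by reading /proc directly. The stack trace in the first post shows pgrep blocked in proc_pid_cmdline, that is, while reading another process's /proc/<pid>/cmdline, so a tool that reads cmdline may get stuck the same way. The sketch below reads only the status files; whether those reads stay responsive on the affected server is an assumption, and list_d_state is a hypothetical helper name.

```shell
# list_d_state: print "PID NAME" for every process whose State is D,
# using only /proc/<pid>/status (no cmdline reads).
# (Hypothetical helper, written for illustration.)
list_d_state() {
    proc_root=${1:-/proc}   # optional argument: alternative proc root
    for status in "$proc_root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        pid=$(basename "$(dirname "$status")")
        awk -v pid="$pid" '
            /^Name:/  { name = $2 }
            /^State:/ { state = $2 }
            END { if (state == "D") print pid, name }
        ' "$status" 2>/dev/null
    done
}

list_d_state
```

A D-state process in the list that is not pgrep would be the first candidate to look at, since it may be the one holding the lock the pgrep instances are waiting for.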


All times are GMT -5.