LinuxQuestions.org
Old 11-28-2017, 03:28 AM   #1
grzeslaw
Member
 
Registered: Nov 2008
Posts: 109

Rep: Reputation: 24
Question lsof: WARNING: can't stat() nfs file system


Hi guys,

On some servers I randomly get NFS issues. This thread is not about resolving those issues, but about getting an lsof command that does not hang.

Whenever I run the following command on an affected system, it hangs, as in this example:
Code:
nagios@myhost:~$ sudo lsof -u appuser -b
lsof: avoiding readlink(/): -b was specified.
lsof: avoiding stat(/): -b was specified.
lsof: WARNING: can't stat() rootfs file system /
      Output information may be incomplete.
lsof: avoiding readlink(/sys): -b was specified.
lsof: avoiding stat(/sys): -b was specified.
lsof: WARNING: can't stat() sysfs file system /sys
      Output information may be incomplete.
lsof: avoiding readlink(/proc): -b was specified.
lsof: avoiding stat(/proc): -b was specified.
lsof: WARNING: can't stat() proc file system /proc
      Output information may be incomplete.
lsof: avoiding readlink(/dev): -b was specified.
lsof: avoiding stat(/dev): -b was specified.
lsof: WARNING: can't stat() devtmpfs file system /dev
      Output information may be incomplete.
lsof: avoiding readlink(/dev/pts): -b was specified.
lsof: avoiding stat(/dev/pts): -b was specified.
lsof: WARNING: can't stat() devpts file system /dev/pts
      Output information may be incomplete.
lsof: avoiding readlink(/dev/pts): -b was specified.
lsof: avoiding stat(/dev/pts): -b was specified.
lsof: WARNING: can't stat() devpts file system /dev/pts
      Output information may be incomplete.
lsof: avoiding readlink(/run): -b was specified.
lsof: avoiding stat(/run): -b was specified.
lsof: WARNING: can't stat() tmpfs file system /run
      Output information may be incomplete.
lsof: avoiding readlink(/): -b was specified.
lsof: avoiding stat(/): -b was specified.
lsof: WARNING: can't stat() ext4 file system /
      Output information may be incomplete.
lsof: avoiding readlink(/run/lock): -b was specified.
lsof: avoiding stat(/run/lock): -b was specified.
lsof: WARNING: can't stat() tmpfs file system /run/lock
      Output information may be incomplete.
lsof: avoiding readlink(/run/shm): -b was specified.
lsof: avoiding stat(/run/shm): -b was specified.
lsof: WARNING: can't stat() tmpfs file system /run/shm
      Output information may be incomplete.
lsof: avoiding readlink(/boot): -b was specified.
lsof: avoiding stat(/boot): -b was specified.
lsof: WARNING: can't stat() ext2 file system /boot
      Output information may be incomplete.
lsof: avoiding readlink(/home): -b was specified.
lsof: avoiding stat(/home): -b was specified.
lsof: WARNING: can't stat() ext4 file system /home
      Output information may be incomplete.
lsof: avoiding readlink(/tmp): -b was specified.
lsof: avoiding stat(/tmp): -b was specified.
lsof: WARNING: can't stat() ext4 file system /tmp
      Output information may be incomplete.
lsof: avoiding readlink(/usr): -b was specified.
lsof: avoiding stat(/usr): -b was specified.
lsof: WARNING: can't stat() ext4 file system /usr
      Output information may be incomplete.
lsof: avoiding readlink(/var): -b was specified.
lsof: avoiding stat(/var): -b was specified.
lsof: WARNING: can't stat() ext4 file system /var
      Output information may be incomplete.
lsof: avoiding readlink(/app-logs): -b was specified.
lsof: avoiding stat(/app-logs): -b was specified.
lsof: WARNING: can't stat() ext4 file system /app-logs
      Output information may be incomplete.
lsof: avoiding readlink(/var/log/app): -b was specified.
lsof: avoiding stat(/var/log/app): -b was specified.
lsof: WARNING: can't stat() ext4 file system /var/log/app
      Output information may be incomplete.
lsof: avoiding readlink(/opt): -b was specified.
lsof: avoiding stat(/opt): -b was specified.
lsof: WARNING: can't stat() ext4 file system /opt
      Output information may be incomplete.
lsof: avoiding readlink(/var/lib/nfs/rpc_pipefs): -b was specified.
lsof: avoiding stat(/var/lib/nfs/rpc_pipefs): -b was specified.
lsof: WARNING: can't stat() rpc_pipefs file system /var/lib/nfs/rpc_pipefs
      Output information may be incomplete.
lsof: avoiding readlink(/NFS_DATA): -b was specified.
lsof: avoiding stat(/NFS_DATA): -b was specified.
lsof: WARNING: can't stat() nfs file system /NFS_DATA
      Output information may be incomplete.
As you can see, I already use the "-b" option, which should avoid the kernel functions that might block, but it looks like lsof hangs anyway.

I need to run this command in my monitoring script to list the files opened by a user. On affected systems, when the NFS issue occurs, the command hangs and leaves orphaned processes behind, which increases the system load. While the issue is present, we can end up with 288 orphaned processes in one day, as the plugin runs every 5 minutes.

Your help will be kindly appreciated!
Thanks,

Last edited by grzeslaw; 11-28-2017 at 03:39 AM.
 
Old 12-02-2017, 05:08 AM   #2
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 1,490

Rep: Reputation: 677
I think you need to supervise lsof, and let the supervision thread kill -9 lsof after some time.
Do you have the timeout command?
 
Old 12-02-2017, 05:47 AM   #3
grzeslaw
Member
 
Registered: Nov 2008
Posts: 109

Original Poster
Rep: Reputation: 24
Thanks for your comment!

Yes, I have a timeout value, but it hangs forever. The other thing is that I am running the script as the nagios user, so I can't kill lsof after some time because of user permissions. OK, I could add kill to sudo for nagios, but that is not a solution, as it requires a sudoers modification on thousands of hosts. I would really prefer a workaround in the script I am responsible for: validate that the NFS share is working properly, and if it is, run the lsof command; otherwise report an error and exit 0.

An interesting thing I found is that kern.log contains messages regarding NFS:
Code:
~# dmesg -T|tail -2
[Fri Dec  1 00:23:15 2017] nfs: server 10.10.4.30 not responding, still trying
[Fri Dec  1 02:37:32 2017] nfs: server 10.10.4.30 not responding, still trying
I got the idea to simply put "dmesg -T|tail -2|grep 'not responding'" into my script, so that I can skip lsof when there is an issue with NFS. But those alerts refer to the old NetApp; we have had a new one with a different IP for months, so this is strange. I think I need to find another way to check whether NFS has issues.
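One way to check the shares before running lsof can be sketched as follows. This is only a sketch under assumptions, not something tested on the poster's systems: the function names nfs_mounts, probe_path and all_nfs_ok are hypothetical, the 5-second deadline is arbitrary, and whether a probe stuck on a hard-mounted share can actually be killed depends on the kernel and the mount options. Reading /proc/mounts itself does not touch the NFS server.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the thread.

# Print the mount points of all NFS file systems listed in a mounts table.
# /proc/mounts is served by procfs, so reading it does not contact the NFS server.
# Note: mount points containing spaces appear escaped (\040) and are not handled here.
nfs_mounts() {
    awk '$3 ~ /^nfs4?$/ { print $2 }' "${1:-/proc/mounts}"
}

# Probe one path with a hard deadline (assumed: 5 seconds).
# Returns non-zero if stat fails or has to be killed.
probe_path() {
    timeout --signal=KILL 5s stat -t "$1" >/dev/null 2>&1
}

# Return 0 only if every NFS mount point answers the probe in time.
all_nfs_ok() {
    for mp in $(nfs_mounts); do
        if ! probe_path "$mp"; then
            echo "NFS mount $mp is not responding" >&2
            return 1
        fi
    done
    return 0
}

# Possible use in the monitoring script:
#   all_nfs_ok || { echo "NFS problem, skipping lsof"; exit 0; }
```

Unlike the dmesg approach, this probes the currently mounted shares rather than relying on old kernel messages, so stale alerts about a retired server would not trigger it.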
 
Old 12-03-2017, 02:27 AM   #4
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 1,490

Rep: Reputation: 677
I meant the timeout command
Code:
man timeout
Then you can run lsof with another timeout, for example
Code:
timeout --signal=9 55s lsof ...
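The behaviour of timeout can be tried safely before using it with lsof. In this small sketch, sleep is only a stand-in for a command that hangs, and the 1-second limit is arbitrary:

```shell
#!/bin/sh
# sleep 10 stands in for a hanging command; timeout sends SIGKILL after 1 second.
timeout --signal=KILL 1s sleep 10
status=$?

# A non-zero status tells the calling script that the command did not finish normally.
echo "exit status: $status"
```

GNU timeout documents the exit status as 128+9 when the KILL signal is used, rather than the usual 124 for a timed-out command, so a monitoring script should treat any non-zero status as "did not complete".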
 
Old 12-03-2017, 01:52 PM   #5
grzeslaw
Member
 
Registered: Nov 2008
Posts: 109

Original Poster
Rep: Reputation: 24
Thanks!

It looks like this command works from the CLI, which is great!
Sadly, when I put it into a script, it hangs forever ;/

Code:
nagios@host002:~$ timeout --signal=9 5s sudo lsof -u user1 2>/dev/null|wc -l
Killed
nagios@host002:~$ echo "timeout --signal=9 5s sudo lsof -u user1 2>/dev/null|wc -l" >test.sh
nagios@host002:~$ 
nagios@host002:~$ chmod +x test.sh 
nagios@host002:~$ 
nagios@host002:~$ 
nagios@host002:~$ ./test.sh 

^C
nagios@host002:~$
 
Old 12-06-2017, 12:34 PM   #6
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 1,490

Rep: Reputation: 677
Wrong order: first sudo then timeout lsof!
Then the kill -9 is done with root rights.
Code:
sudo timeout --signal=9 5s lsof -u user1 2>/dev/null|wc -l
 
Old 12-14-2017, 03:27 AM   #7
grzeslaw
Member
 
Registered: Nov 2008
Posts: 109

Original Poster
Rep: Reputation: 24
None of those solutions work when the server has the NFS issue; lsof still hangs.
To resolve it, I wrote my own lsof, based on the /proc/PID/(smaps|fd) entries.
Taking this into account, we can consider the issue resolved.
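The poster's script is not shown in the thread. As a rough illustration of the /proc-based idea only, the sketch below counts the open file descriptors of all processes owned by one user by listing /proc/PID/fd. The function name count_open_fds is hypothetical, it takes a numeric UID, and it ignores the smaps part (memory-mapped files) that the poster also mentions.

```shell
#!/bin/sh
# Hypothetical helper, not the poster's script: count open file descriptors
# of all processes owned by a numeric UID, using only procfs.
count_open_fds() {
    uid=$1
    total=0
    for dir in /proc/[0-9]*; do
        # /proc/PID is normally owned by the effective UID of the process.
        [ "$(stat -c %u "$dir" 2>/dev/null)" = "$uid" ] || continue
        # Listing the fd directory reads descriptor numbers from procfs;
        # the files they point to are not opened.
        n=$(ls "$dir/fd" 2>/dev/null | wc -l)
        total=$((total + n))
    done
    echo "$total"
}

# Example: count descriptors for the user running this script.
count_open_fds "$(id -u)"
```

For the case in this thread the call would be something like count_open_fds "$(id -u appuser)". Reading the fd directories of another user's processes needs root rights, so the script would still have to run under sudo.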
 
Old 12-14-2017, 01:17 PM   #8
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 1,490

Rep: Reputation: 677
Well done.
More and more often I face "featurism" that puts the base function at risk, and would like to write my own "simply works" programs...
 
Old 12-14-2017, 01:53 PM   #9
grzeslaw
Member
 
Registered: Nov 2008
Posts: 109

Original Poster
Rep: Reputation: 24
Sometimes you have no choice. This is not the first time I was forced to code my own functions; that's life. But the good thing is that you know exactly what it does, and you can quickly implement a fix in case of other issues.
 
Old 12-27-2017, 09:27 AM   #10
PinoyUser
LQ Newbie
 
Registered: Dec 2017
Posts: 1

Rep: Reputation: Disabled
Use strace to find out where it is hanging

Try:
strace lsof
ps aux | grep <pid where it is hanging>
 
  


All times are GMT -5.
