Linux - File descriptors exhausted, how to recover
I am logged in as "root", and after a while my CentOS box runs out of file descriptors. How do I find out which user or process is polluting my server, and how do I get the server back without rebooting? I am on a bash shell.
You don't say what version of CentOS you're running, how many users, what's running on the server, the environment it's in, etc., so there's no way we can even guess as to what's causing it.
CentOS 5.5. 64-bit. 4-5 users. It is a test box and guys just run test code. I use a bash shell.
So one of the guys runs a script which creates a lot of files and keeps them open so that the system runs out of file-descriptors. I am on another shell, still logged in as root. How can I recover the system?
If he's opening too many files (and the default settings are pretty high, so it shouldn't be a problem or he's doing something odd/wrong), then he needs to close them or you can kill some of his processes, which will have the same effect.
Don't mess with the system settings; that's addressing the symptom(s), not the root cause.
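Killing the offending processes, as suggested above, only needs their PIDs once you have them. A minimal sketch of the usual escalation (SIGTERM first, so the process can clean up); the sleep here is a hypothetical stand-in for the runaway script:

```shell
# Stand-in for the offending process (hypothetical example).
sleep 300 &
pid=$!

# Polite request first; the process gets a chance to clean up.
kill -TERM "$pid"
wait "$pid" 2>/dev/null || true

# kill -0 only probes for existence; if it fails, the process is gone.
if ! kill -0 "$pid" 2>/dev/null; then
    echo "process $pid terminated"
fi
```

If a process ignores SIGTERM, `kill -KILL "$pid"` is the last resort; it cannot be caught, but the process gets no chance to close its files cleanly.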
That is exactly my question... how? All the file descriptors on the system are exhausted. You cannot run any commands (kill, lsof, ls, ps, cat, etc.) except the built-in shell commands. I first need to find the offending user and processes, and then take action... since I can't run most of the commands, how do I go about it?
Last edited by shib4u; 07-26-2012 at 12:59 PM.
Reason: minor edit
Quote:
Originally Posted by shib4u
All the file descriptors are exhausted by the system. You cannot run any commands (kill, lsof, ls, ps, cat, etc.) except the inbuilt shell commands
then, quite honestly, I believe reboot is the only answer in this case.
HOWEVER, you might want to wait to see if one of the Mods has a better answer.
I can't think of a built-in cmd that would do what you need without using a file descriptor.
Why don't you ask your users, maybe the offender will know of a way of signalling his/her procs to die?
How do I find out who (user/process) is polluting my server and how to get the server back without rebooting?
Please check the /proc/<PID>/fd directory.
If a process is not closing its file descriptors properly, you'll see a huge number of fds in that directory!
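For readers who reach this thread before the table is completely exhausted, counting the entries under each /proc/<PID>/fd shows who is hoarding descriptors. A sketch, assuming fork/exec still works (it spawns ls, wc, sort, and head, which is exactly what fails once the limit is hit):

```shell
# List the five processes with the most open file descriptors.
# Must be run while the system can still spawn new processes.
for pid in /proc/[0-9]*; do
    n=$(ls "$pid/fd" 2>/dev/null | wc -l)
    [ "$n" -gt 0 ] && echo "$n ${pid##*/}"
done | sort -rn | head -5
```

Run as root so that every process's fd directory is readable; unreadable ones are silently skipped by the 2>/dev/null.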
Did you read what the OP posted???
Quote:
Originally Posted by shib4u
You cannot run any commands (kill, lsof, ls, ps, cat, etc.) except the inbuilt shell commands
...so, if they can't run an ls or cat....HOW will they check that directory???
Quote:
Originally Posted by shib4u
So one of the guys runs a script which creates a lot of files and keeps them open so that the system runs out of file-descriptors. I am on another shell, still logged in as root. How can I recover the system?
You don't. You reboot the system, and go to the guy who wrote that script, and tell them to not run it again, or learn how to program correctly. To solve ANY problem, you need to identify the root cause. You have...now fix it.
I was stuck in the same situation.
Let us know if you are able to run the df -i command. You may find 100% inode usage on the suspected partition.
I used the find command to remove files older than 20 days.
Quote:
find /path/to/files* -mtime +20 -exec rm {} \;
Here, if you know the extension or the starting characters of the file names, you can modify the command.
On my system it was .txt, so I used that in the command. (Beware: the extension must match only the files you want to remove; otherwise do not take the risk, and use the starting name of the file instead.)
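The modification described above, restricting the match to a known extension, can be done with find's -name test so unrelated files are never touched. A sketch using a scratch directory as a placeholder for the real path; print the list first and delete only after reviewing it:

```shell
# Demo setup: a scratch directory standing in for /path/to/files,
# with one stale file and one fresh file that must survive.
dir=$(mktemp -d)
touch -d '30 days ago' "$dir/old.txt"   # older than 20 days
touch "$dir/new.txt"                    # fresh, must survive

# Dry run first: -name limits the match to the known extension.
find "$dir" -name '*.txt' -mtime +20 -print

# Once the list looks right, delete exactly the same set.
find "$dir" -name '*.txt' -mtime +20 -exec rm {} \;
ls "$dir"    # only new.txt remains
```

Note that this only reclaims inodes and disk space; it does not help with exhausted file descriptors, as the next reply points out.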
If you have busybox on your system, you could try using it as your shell. That might let you run (some) commands without needing additional descriptors.
@sharadchhetri: inodes != file descriptors..
He basically can't run any cmds because all the useful ones require one or more file descriptors to be opened...
"So one of the guys runs a script which creates a lot of files and keeps them open so that the system runs out of file-descriptors. I am on another shell, still logged in as root. How can I recover the system?"
Shibu, are you unable to run the ls command only in that directory, or in every directory?
Last edited by sharadchhetri; 07-29-2012 at 11:54 PM.
The only commands that have any chance of running are the shell builtins, and any of those that need to access a file or directory won't work either. If the PID of the offending process were known, you could use the kill command (a shell builtin) to terminate the process, but I don't know of any way to find that PID without being able to access /proc, and you'd need a file descriptor to do that.
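The point that kill is a bash builtin (and therefore needs no new process and no new file descriptors) is easy to verify with the type builtin:

```shell
# "type" reports where a command name resolves from; in bash,
# the builtin kill is found before /bin/kill is ever consulted.
type kill
# prints: kill is a shell builtin

# The builtin accepts the usual signal syntax, e.g.:
# kill -TERM <pid>
```

This is why knowing the PID in advance matters so much: the kill itself would still work, but every way of discovering the PID (ps, lsof, reading /proc) needs a descriptor.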
Perhaps the OP's problem is as simple as changing the ulimit -n value for his shell (if his distribution permits it). From the manual page:
Code:
ulimit [-HSTabcdefilmnpqrstuvx [limit]]
Provides control over the resources available to the shell and
to processes started by it, on systems that allow such control.
The -H and -S options specify that the hard or soft limit is set
for the given resource. A hard limit cannot be increased by a
non-root user once it is set; a soft limit may be increased up
to the value of the hard limit. If neither -H nor -S is specified, both the soft and hard limits are set. The value of limit
can be a number in the unit specified for the resource or one of
the special values hard, soft, or unlimited, which stand for the
current hard limit, the current soft limit, and no limit,
respectively. If limit is omitted, the current value of the
soft limit of the resource is printed, unless the -H option is
given. When more than one resource is specified, the limit name
and unit are printed before the value. Other options are interpreted as follows:
-a All current limits are reported
-b The maximum socket buffer size
-c The maximum size of core files created
-d The maximum size of a process's data segment
-e The maximum scheduling priority ("nice")
-f The maximum size of files written by the shell and its
children
-i The maximum number of pending signals
-l The maximum size that may be locked into memory
-m The maximum resident set size (many systems do not honor
this limit)
-n The maximum number of open file descriptors (most systems
do not allow this value to be set)
Note that it's implied that this is a "per-process" limit, so a simple ctrl-alt-f2 to start a new tty session might be all that's needed. (Provided that the OP is not in an X-session with VTSwitch turned off, which is the default setting in newer Xorg releases.)
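To illustrate the per-process point: each new shell carries its own copy of the limit, which the ulimit builtin can inspect and, for root, raise. The 8192 below is only an example value; how high you can actually go depends on the hard limit and the kernel's fs.file-max:

```shell
# Current soft and hard limits on open file descriptors
# for this shell (and anything it starts):
ulimit -Sn
ulimit -Hn

# A root shell could raise its own soft limit for the session, e.g.:
# ulimit -n 8192
# (non-root users may only raise the soft limit up to the hard limit)
```

Note this helps a fresh shell on a new tty; it does nothing for a system whose global descriptor table is already full.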