By rahulk at 2007-02-06 12:10
This is a problem we have run into a lot: whenever a server whose file system is mounted on another Linux server goes down, the services on the mounting server, and sometimes the server itself, hang.
NFS mounts support the "intr" flag, which allows killing processes that are accessing a partition which is no longer reachable (because the mounted server has gone down). In a normal scenario, the "intr" flag is enough to kill processes that are hung because the mounted file system is unavailable. But there have been many scenarios in which even the "intr" flag is not enough and the process does not go away.
There is a very cool trick to kill these processes. Here is how it works:-
Whenever you run the command to unmount an NFS partition, the first thing the OS does is check whether any process is currently using the partition. You can use the following command to figure this out:-
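The usual tools for this check are fuser (with -m and the mount point) or lsof, though both can themselves block when the NFS server is unreachable. As a rough sketch that only reads symlinks under /proc and never touches the mount itself, something like the following could be used. The function name pids_using and the mount point /mnt/nfs are made up for illustration, and it assumes a Linux /proc file system.

```shell
#!/bin/sh
# Print the PIDs of processes whose working directory or open files
# live under the given mount point (Linux /proc only).
# Run as root to see processes owned by other users.
pids_using() {
    mnt=${1%/}                      # strip one trailing slash, if any
    for proc in /proc/[0-9]*; do
        for link in "$proc/cwd" "$proc"/fd/*; do
            # readlink fails for entries we cannot read; skip those
            target=$(readlink "$link" 2>/dev/null) || continue
            case $target in
                "$mnt"|"$mnt"/*) echo "${proc#/proc/}"; break ;;
            esac
        done
    done
}

pids_using /mnt/nfs   # hypothetical mount point
```

Each PID is printed once, on its own line, so the output can be fed straight to kill.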
If the processes which are currently using this partition do not respond to "kill -9", then you WILL have to make the MOUNTED partition available again before you can kill them. So here is a work-around for this situation.
1. Create a virtual interface on the problem server which has the same IP address as the mounted server (the one which has gone down).
Let's say my mounted server has the IP 192.168.1.3 and my server which holds the mounted partition is 192.168.1.4.
Now, to kill the hung processes on 192.168.1.4, we will have to bring the server (192.168.1.3) up in spite of the fact that it is actually down ;). So create a new virtual interface on the ethernet card of 192.168.1.4.
ifconfig eth0:1 192.168.1.3 netmask 255.255.252.0
What we have done here is bring the remote server's IP up on the local machine ;).
Now you can kill the processes and unmount the file system.
After finishing up, remember to bring down the interface.
ifconfig eth0:1 down
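The steps above can be strung together roughly as follows. This is only a sketch under a few assumptions: it uses the iproute2 "ip" command as a counterpart to the ifconfig alias (255.255.252.0 written as a /22 prefix), the mount point /mnt/nfs and the PIDs are placeholders, and the run and release_mount helpers are invented for this example. By default it only prints the commands; set DRY_RUN=0 and run it as root to actually execute them.

```shell
#!/bin/sh
# Sketch of the whole work-around. All values below are placeholders.
DEAD_IP=192.168.1.3      # address of the NFS server that went down
PREFIX=22                # 255.255.252.0 as a prefix length
IFACE=eth0
MNT=/mnt/nfs             # hypothetical mount point
DRY_RUN=${DRY_RUN:-1}    # default: only print the commands

# Print the command in dry-run mode, otherwise execute it
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Arguments: the PIDs of the hung processes
release_mount() {
    # Borrow the dead server's address on a labelled alias
    run ip addr add "$DEAD_IP/$PREFIX" dev "$IFACE" label "$IFACE:1"
    for pid in "$@"; do
        run kill -9 "$pid"
    done
    run umount -f "$MNT"
    # Drop the borrowed address again, even if umount failed
    run ip addr del "$DEAD_IP/$PREFIX" dev "$IFACE"
}

release_mount 1234 5678   # PIDs are examples only
```

Removing the address at the end matters: if the real server comes back while the alias is still up, two machines will answer for the same IP.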
That's it!! This works for the simple reason that the mount and unmount commands always go by the IP address of the server being mounted/unmounted rather than its hostname. So this should work out for you.
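To see which address a given mount is actually tied to, the mount table can be inspected. On Linux, NFS entries in /proc/mounts normally carry an addr= option with the server's IP. The helper below is a small sketch; the name nfs_server_addr and the mount point /mnt/nfs are made up for the example.

```shell
#!/bin/sh
# Print the server address recorded for an NFS mount point.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts)
nfs_server_addr() {
    awk -v mnt="$1" '
        $2 == mnt && $3 ~ /^nfs/ {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^addr=/) { print substr(opts[i], 6); exit }
        }' "${2:-/proc/mounts}"
}

nfs_server_addr /mnt/nfs   # hypothetical mount point
```

The address it prints is the one to put on the virtual interface.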