I have a heavily used Linux data server running RH7.3 with a custom-compiled kernel. After about a week of uptime, NFS locks up. The machine itself keeps running, but NFS is broken and will not die: the init scripts cannot shut it down, and neither kill -9 nor kill -15 has any effect on the NFS processes. At the same time there were about 50 smbd -D processes running that could not be killed either. When I looked at dmesg, there were several of the following errors:
Kernel: NFS : Task xxxxx cannot get a request slot
I do not know whether this error is relevant to the problem. When I tried to reboot the computer, the shutdown process hung immediately. I then tried a halt, which also hung without getting anywhere. In the end I resorted to a hard reset.
Does anyone have any ideas why all this may be happening?
I did check the CPU usage. I have an old laptop that constantly displays xosview or perfmeter output from key computers, so I can see at a glance if any are being hammered.
The problem seems to be due to either the massive spawning of smbd -D processes or something else causing NFS to lock solid.
I did not try renicing any of the processes because none of them were using any CPU.
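Processes that use no CPU and ignore kill -9 are often sitting in uninterruptible sleep (state D), typically waiting on I/O such as an NFS request. As a rough way to check that, here is a minimal sketch that reads /proc/<pid>/stat and lists anything in state D. It assumes a Python 3 interpreter is available, and the function names are just ones picked for this example.

```python
import glob


def parse_stat_line(line):
    """Return (pid, command, state) from one /proc/<pid>/stat line."""
    # The command sits in parentheses and may itself contain spaces or
    # parentheses, so split on the first "(" and the last ")".
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    command = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, command, state


def uninterruptible(stat_lines):
    """Keep only processes whose state is D (uninterruptible sleep)."""
    parsed = (parse_stat_line(entry) for entry in stat_lines if entry.strip())
    return [(pid, command) for pid, command, state in parsed if state == "D"]


def read_proc_stats():
    """Collect the stat line of every process currently visible in /proc."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                lines.append(handle.read())
        except OSError:
            # The process exited between listing and reading; skip it.
            continue
    return lines


if __name__ == "__main__":
    for pid, command in uninterruptible(read_proc_stats()):
        print(pid, command)
```

If the stuck smbd and NFS processes all show up in this list, that would at least be consistent with them waiting on something that never answers, rather than with a CPU load problem.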
The main problem is that when the computer goes into this state I cannot shut it down cleanly. The shutdown gets to NFS and then hangs. I have tried leaving it in this state overnight, but nothing changes: it neither times out nor completes. I have to resort to the reset button. This is not good for the system, and because it has 2x75GB SCSI drives, checking the filesystems takes ages. The drives come from a prior installation of Linux, and the boss did not want me to try upgrading the filesystems to ext3 in case of data loss.
I have all my mounts set to hard. I have had problems with soft mounts, including some file corruption. This is apparently a known problem, and it is recommended that soft mounting only be used for read-only filesystems. At least this is what I have read in the admin books I have. The server contains our source code and so must not become corrupted. I have not seen the intr option before. I will look into it and see whether it is recommended for a read/write filesystem.
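To double-check which NFS mounts are hard or soft and which carry intr, something like the following sketch could be run over the contents of /etc/fstab. It only looks at the options column and treats a mount as hard unless soft is given, which is the documented NFS default. The function name and the example server and paths are made up for illustration.

```python
def nfs_mount_report(fstab_text):
    """Map each NFS mount point to its hard/soft and intr settings."""
    report = {}
    for line in fstab_text.splitlines():
        if line.strip().startswith("#"):
            continue  # skip comment lines
        fields = line.split()
        # fstab fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = fields[3].split(",")
        report[fields[1]] = {
            # NFS mounts are hard unless "soft" is given explicitly.
            "hard": "soft" not in options,
            "intr": "intr" in options,
        }
    return report


# Hypothetical fstab contents, for illustration only.
example = """\
server:/src  /mnt/src  nfs   rw,hard,intr  0 0
server:/pub  /mnt/pub  nfs   ro,soft       0 0
/dev/sda1    /         ext2  defaults      1 1
"""

for mount_point, settings in nfs_mount_report(example).items():
    print(mount_point, settings)
```

On a real client the text would come from open("/etc/fstab").read() instead of the example string.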