program run from nfs-mounted directory behaves badly
Other *NIX: This forum is for the discussion of any UNIX platform that does not have its own forum. Examples would include HP-UX, IRIX, Darwin, Tru64 and OS X.
I have a FORTRAN program (not written by me) whose executable is installed on our SGI server. From my Octane workstation, I NFS-mount the directory where that program resides and execute it. The program does two strange things.
1) It takes over 1 minute to start, while normal initialization takes 11 seconds when run from a local disk. Other programs do not have this problem. I have talked to friends who have a similar installation, and they don't see this issue.
2) When the program exits normally, it kills the local shell in which it was being run. Again, other programs don't do this, and similar installations elsewhere don't show this weird behavior.
Not sure if this is it, since you indicate it does run eventually, but I'll note it:
NFS in Linux doesn't (or didn't when I ran into this a year ago) support flock for file locking very well. Apparently flock support was even removed from the kernel version I was working on then (I can't recall exactly which, but I THINK it was RH EL AS 2) because of this issue. When deploying a couple of different applications, I found I had to move certain parts of each application onto internal storage so it would work properly.
One of the vendors' developers said he was going to try building his own locking mechanism using semaphores instead of flock. Since I'm not a programmer, I couldn't say how viable that was.
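For anyone wanting to check whether flock is even usable on a particular mount, here is a minimal sketch in Python, assuming an interpreter is available on the client. The function name probe_flock is my own, not part of any tool mentioned in this thread. It creates a scratch file in the directory, asks for a non-blocking exclusive lock, and reports whether the call succeeded. A pass only shows that the call works from this one client; it does not prove that locks are honored between different NFS clients.

```python
import fcntl
import os
import tempfile


def probe_flock(directory):
    """Try to take and release an exclusive flock() on a scratch file.

    Returns (True, None) if the lock call succeeds, or (False, message)
    if the operating system rejects it. `probe_flock` is a hypothetical
    helper name used only for this illustration.
    """
    fd, path = tempfile.mkstemp(prefix="flock_probe_", dir=directory)
    try:
        try:
            # LOCK_NB makes the call fail right away instead of hanging
            # if the lock cannot be granted.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            return False, str(exc)
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True, None
    finally:
        # Always clean up the scratch file, whether or not locking worked.
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    ok, why = probe_flock(target)
    print("%s: flock %s" % (target, "ok" if ok else "FAILED: " + why))
```

Run it once against a local directory and once against the NFS-mounted one and compare the two results.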
Hmm. That is interesting. It might involve file locking, I suppose, since others might make use of the same software, library files, etc.
Your other suggestion, keeping certain parts of the program on a local disk, is practical in theory, but how would I figure out which components are causing the problem? It could be any one of several, so brute-force trial and error, I suppose.
Writing code to bypass or augment file locking is much more work than I am willing to attempt.
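One way to narrow down the trial and error might be to time how long each file in the program's directory takes to read over the mount, and start with the slowest ones. The sketch below assumes Python is available on the workstation; time_reads is a made-up helper name. It only measures plain reads, so it would not catch a delay caused purely by locking, and a second run may look faster because of client-side caching.

```python
import os
import time


def time_reads(directory, chunk=1 << 20):
    """Read every regular file under `directory` and time it.

    Returns a list of (seconds, bytes_read, path) tuples, slowest first.
    `time_reads` is a hypothetical helper name for this illustration.
    """
    results = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if not os.path.isfile(path):
                continue  # skip sockets, broken symlinks, and the like
            start = time.perf_counter()
            total = 0
            try:
                with open(path, "rb") as fh:
                    while True:
                        block = fh.read(chunk)
                        if not block:
                            break
                        total += len(block)
            except OSError:
                continue  # unreadable file: leave it out of the report
            results.append((time.perf_counter() - start, total, path))
    results.sort(reverse=True)
    return results


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for seconds, size, path in time_reads(target)[:20]:
        print("%8.3f s  %12d bytes  %s" % (seconds, size, path))
```

Comparing the output for the NFS-mounted copy against a local copy of the same tree should at least show whether the extra startup time is spent reading particular files.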
Thanks for your reply, it gives me something to think about.
Yep, unfortunately. For one app we were able to clearly identify the piece that was having issues with flock. For the other, the vendor told us "everything uses flock", which was likely BS, but we didn't have time for trial and error, so we just moved it all to internal storage, since they had committed to coming up with a long-term solution that would let us move it back later. I left that job before they delivered, so I don't know if they ever did.
Also, when I talked to NetApp recently, they said they had never heard of this issue, even though it was on NetApp filers that we had it. Since I saw it with two different apps and had documentation from Red Hat about it at the time, I'm sure it did exist. It occurred to me that their statement might be because it was solved in later kernels, or it may have been fairly rare in the first place. For MOST of our Oracle Financials app it was not an issue.
I just looked back at my notes, and they say that flock was removed from Red Hat Linux 3.x and that we had considered going back to Red Hat Linux 2.x because it was still there. The note indicates it was removed from 3.x because it was unreliable in 2.x. Since I was new to Linux at the time, I suspect the 2.x and 3.x I wrote mean AS 2 and AS 3, since we were using their commercial products.