LinuxQuestions.org


jabel5 09-14-2005 08:25 AM

program run from nfs-mounted directory behaves badly
 
I have a FORTRAN program (not written by me) whose executable is installed on our SGI server. From my Octane workstation, I nfs-mount the directory where that program resides, and execute it. The program does two strange things.
1) It takes over 1 minute to start, while the normal initialization takes 11 seconds when run from a local disk. Other programs do not have this problem. I have talked to friends who have a similar installation, and they don't see this issue.
2) When the program exits normally, it kills the local shell in which it was being run. Again, other programs don't do this, and similar installations elsewhere don't have this weird behavior.

Any idea what is going on here?
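
One way to narrow down the slow start is to check whether simply reading the files over NFS accounts for the delay. The sketch below assumes Python 3 is available on the workstation; `time_read` is a hypothetical helper, not part of any existing tool. It reads one file from start to finish and reports how long that took.

```python
import time


def time_read(path, chunk_size=1 << 20):
    """Read `path` from start to finish; return (bytes_read, seconds_taken)."""
    total = 0
    start = time.monotonic()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total, time.monotonic() - start
```

Run it once on the executable (or a data file it loads) under the NFS mount and once on a local copy. A second read of the same file may be served from cache, so the first run is the one to compare. If the raw read times are close, the extra delay probably comes from something other than transfer speed.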

MensaWater 10-03-2005 02:01 PM

Not sure if this is it since you indicate it does run eventually but will note it:

NFS in Linux doesn't (or didn't when I ran into this a year ago) support flock for file locking very well. Apparently they even removed support from the version of the kernel I was working on then (can't recall what it was, but I THINK it was RH EL AS 2) due to this issue. When deploying a couple of different applications, I found I had to move certain parts of the application onto internal storage so it would work properly.

The fix one of the vendors' developers suggested he was going to try was coming up with his own mechanism using semaphores instead of flock. Since I'm not a programmer I couldn't say how viable that was.

jabel5 10-04-2005 04:53 PM

Hmm. That is interesting. It might involve file locking, I suppose, since others might make use of the same software, library files, etc.

Your other comment regarding having certain parts of the program stored on a local disk is practical in theory, but how to figure out what components are causing the problem? It could be one of several.... brute force trial and error, I suppose.
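
Before resorting to trial and error, it may be worth checking whether locking works at all on each directory the program touches. The following is a minimal sketch, assuming Python 3 is available on the client; `probe_locks` is a hypothetical helper. It tries a non-blocking flock() and a POSIX (fcntl-style) lock on a scratch file and records whether each succeeded and how long it took.

```python
import fcntl
import os
import time


def probe_locks(directory):
    """Try flock() and POSIX lockf() locks on a scratch file in `directory`.

    Returns a dict mapping lock type to (succeeded, seconds_taken, error_text).
    """
    path = os.path.join(directory, ".lock_probe_%d" % os.getpid())
    results = {}
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for name, lock_call in (("flock", fcntl.flock), ("lockf", fcntl.lockf)):
            start = time.monotonic()
            try:
                # Non-blocking, so a broken lock service fails instead of hanging.
                lock_call(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                lock_call(fd, fcntl.LOCK_UN)
                results[name] = (True, time.monotonic() - start, "")
            except OSError as exc:
                results[name] = (False, time.monotonic() - start, str(exc))
    finally:
        os.close(fd)
        os.unlink(path)  # remove the scratch file
    return results
```

Calling `probe_locks` on the NFS-mounted directory and again on a local one gives a quick comparison. A failure or a long delay only on the NFS side would point toward locking, though it would not prove that this particular program uses locks.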

Writing code to bypass or augment file locking is much more work than I am willing to attempt.

Thanks for your reply, it gives me something to think about.

MensaWater 10-05-2005 08:40 AM

Yep. Unfortunately, we were able to clearly identify the piece that was having issues with flock for only one app. For the other, the vendor told us "everything uses flock", which was likely BS, but we didn't have time for trial and error, so we just moved it all to internal storage, since they committed to coming up with a long-term solution to let us move it back later. I left that job before they did, so I don't know if they did.

Also, when I talked to NetApp recently, they said they had never heard of this issue. It was on NetApp filers that we had it. Since I saw it on two different apps and had documentation from RedHat at the time regarding it, I'm sure it did exist. It occurred to me their statement might be because it was solved in later kernels. Or it may have been fairly rare in the first place. For MOST of our Oracle Financials' app it was not an issue.

Just looked back at my notes, and they say that flock was removed from Red Hat Linux 3.x and that we had considered going back to Red Hat Linux 2.x because it was still there. The note indicates it was removed from 3.x because it was unreliable in 2.x. Since I was new to Linux at the time, I suspect the 2.x and 3.x I wrote mean AS 2 and AS 3, since we were using their commercial stuff.


All times are GMT -5.