LinuxQuestions.org


suicidaleggroll 05-20-2015 12:02 PM

Odd permission(?) issue over NFS mount
 
I'll try to keep this simple. I'm having an odd problem on one of my servers when writing to another server over NFS. The user has the same UID/GID on both machines and full ownership over the directory (and its parent, and its parent, etc.), so that's not the issue.

The problem occurs when a script is executed remotely through Torque/PBS. The script basically looks like this:
Code:

#PBS -l walltime=0:05:00
#PBS -l nodes=1:ppn=1
#PBS -o pbs_outfile.out
#PBS -j oe
#PBS -V
#PBS -m a
#PBS -M user@domain.com
echo "Job started on `hostname` at `date`"
cd /path/to/run_location
/path/to/binary < input_file >& /path/to/run_location/output_file
echo "Job ended on `hostname` at `date`"

Here, /path/to is on this NFS mount in every case. The script exits with the error:
/path/to/run_location/output_file: No such file or directory

Obviously output_file doesn't exist yet, since the script is supposed to create it. But its parent directory clearly exists, because the script was able to cd into it and write pbs_outfile.out in that same directory. What's even weirder is that after this error occurs, I see the following behavior in a terminal on the client machine:
Code:

$ cd /path/to/run_location
$ touch output_file
touch: cannot touch output_file: No such file or directory
$ touch dummy_file
$ touch output_file

After touching dummy_file, the second touch of output_file works without issue.

This is not a problem on any other client machines using the same user with the same UID/GID, and the NFS export is not picky about IP as long as it's on the same subnet (which it is).

I am unable to replicate this odd "No such file or directory" behavior except through the torque server/client interface on this one client.

The server is running CentOS 6, the client is CentOS 7, both using the same version of torque.

Any thoughts?

smallpond 05-20-2015 12:34 PM

NFS v3 uses uid and gid. NFS v4 has idmap to map arbitrary users on one machine to users on another machine. Which are you using? Note that CentOS 7 will default to v4.
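
For reference, a minimal sketch of how to check the idmap side of this. With NFS v4, owners travel as name@domain strings, and client and server generally need the same domain for rpc.idmapd to map them. The sketch assumes the usual /etc/idmapd.conf layout with a "Domain = ..." line. idmap_domain is a hypothetical helper name, not part of nfs-utils. Whether a domain mismatch explains the error in this thread is not established.

```shell
# Print the NFSv4 ID-mapping domain set in an idmapd.conf-style file.
# idmap_domain is a hypothetical helper; the default path is an assumption.
idmap_domain() {
    awk -F= '
        # Skip commented-out lines such as "#Domain = ..."
        /^[ \t]*[#;]/ { next }
        # First active "Domain = value" line: trim whitespace and print it
        /^[ \t]*Domain[ \t]*=/ {
            gsub(/^[ \t]+|[ \t]+$/, "", $2)
            print $2
            exit
        }' "${1:-/etc/idmapd.conf}"
}

# Usage (run on both client and server and compare the results):
#   idmap_domain
#   idmap_domain /path/to/other/idmapd.conf
```

If the function prints nothing, no Domain line is active in the file, and rpc.idmapd derives the domain from the host's DNS domain name instead.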

suicidaleggroll 05-20-2015 12:49 PM

It looks like that was it! I remounted with NFS v3 and everything seems to be going through just fine now.
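
For anyone checking the same thing, here is a minimal sketch that lists which NFS version each mount negotiated. It assumes the Linux /proc/mounts format, where the options field carries vers=. nfs_versions is a hypothetical helper name, and server:/export in the comments is a placeholder, not the real export from this thread.

```shell
# Print each NFS mount point followed by the protocol version in use.
# Assumes /proc/mounts fields: device, mount point, fstype, options, ...
# nfs_versions is a hypothetical helper, not part of nfs-utils.
nfs_versions() {
    awk '$3 ~ /^nfs4?$/ {
        ver = "unknown"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^(nfs)?vers=/) {
                sub(/^(nfs)?vers=/, "", opts[i])
                ver = opts[i]
            }
        print $2, ver
    }' "${1:-/proc/mounts}"
}

# Usage:
#   nfs_versions
# Remounting with v3 (as root; server:/export is a placeholder):
#   umount /path/to
#   mount -t nfs -o vers=3 server:/export /path/to
```

Running `nfsstat -m` on the client should report the same mount options, if nfs-utils is installed.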

How are you supposed to handle permissions properly with NFS v4 then? The UID/GID and user/group names all matched on both systems, and I was able to write to the directory without issue on the client, which made me think the perms were fine. There was just this weird intermittent "No such file or directory" problem when launching through torque.


All times are GMT -5.