Diskless Linux
I am trying to set up a Beowulf cluster of sorts. Most of the necessary servers are running, but I have hit a serious roadblock mounting the NFS root file system.
When the client node boots, it successfully fetches pxelinux.0 and begins booting, then hangs with the following errors:
INIT: version 2.86 booting
The system is coming up. Please wait.
write locks are prohibited with --ignorelockingfailure.
unable to obtain global lock.
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
/bin/rm: cannot remove '/etc/mtab': Read-only file system
/bin/touch: cannot touch '/etc/mtab': Read-only file system
mount.nfs: cannot touch '/etc/mtab': Read-only file system
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
/etc/rc: line 59: /var/run/utmp: Read-only file system
/bin/mkdir: cannot create directory '/tmp/.ICE-unix': Read-only file system
/sbin/ldconfig: Can't create temporary cache file /etc/ld.so.cache~: Read-only file system
hostname: Marvin
/bin/ln: cannot remove '/etc/localtime': Read-only file system
font: default
keyboard: us
FATAL: Could not open /lib/modules/2.6.30.5/modules.dep.temp for writing: Read-only file system
Here is my current setup:
Deepthought is the master node; it runs Ubuntu 9.10 Server Edition and serves the NFS root for the slave nodes.
Marvin is the slave node and runs CRUX Linux 2.6.
Eddie is my gateway/router; it runs ClearOS with Dnsmasq, which also provides the TFTP server.
Here are the relevant configuration files:
On Eddie:
/etc/dnsmasq/dhcp.conf:
dhcp-option=eth1,1,255.255.255.0
dhcp-option=eth1,3,192.168.1.1
dhcp-option=eth1,6,192.168.1.1
dhcp-range=eth1,192.168.1.100,192.168.1.254,12h
read-ethers
dhcp-boot=pxelinux.0
enable-tftp
tftp-root=/var/lib/tftpboot
tftp-no-blocksize
/var/lib/tftpboot/pxelinux.cfg/default:
DEFAULT vmlinuz
APPEND root=/dev/nfs ip=dhcp nfsroot=192.168.1.42:/nodes/marvin init=/sbin/init
IPAPPEND 1
TIMEOUT 3
PROMPT 1
DISPLAY boot.msg
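One variant I am unsure about: as far as I understand, the kernel mounts the root file system read-only unless rw is passed on the kernel command line, which might be related to the "Read-only file system" errors in the boot log. A sketch of the same file with rw added follows; the rw flag is my assumption and I have not tested it.

```
# /var/lib/tftpboot/pxelinux.cfg/default
# Same as above, except "rw" is added to APPEND (untested assumption)
DEFAULT vmlinuz
APPEND root=/dev/nfs rw ip=dhcp nfsroot=192.168.1.42:/nodes/marvin init=/sbin/init
IPAPPEND 1
TIMEOUT 3
PROMPT 1
DISPLAY boot.msg
```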
On Deepthought:
/etc/exports:
/nodes/marvin *(rw,no_root_squash)
/nodes/marvin/etc/fstab
192.168.1.42:/nodes/marvin / nfs defaults 1 1
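For reference, my reading of the mount.nfs message is that it wants the root entry changed along these lines. This is only a sketch of that suggestion: nolock is the option named in the error itself, and I have not confirmed that it clears the read-only errors.

```
# /nodes/marvin/etc/fstab
# Untested variant: only the options field differs from the entry above
192.168.1.42:/nodes/marvin / nfs defaults,nolock 1 1
```

The alternative the message mentions, starting rpc.statd on the client before the root is remounted, would leave the fstab entry unchanged.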
And here is the output of running rpcinfo -p on Deepthought:
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100021 1 udp 34321 nlockmgr
100021 3 udp 34321 nlockmgr
100021 4 udp 34321 nlockmgr
100021 1 tcp 55278 nlockmgr
100021 3 tcp 55278 nlockmgr
100021 4 tcp 55278 nlockmgr
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100003 4 udp 2049 nfs
100003 2 tcp 2049 nfs
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100005 1 udp 40477 mountd
100005 1 tcp 49695 mountd
100005 2 udp 40477 mountd
100005 2 tcp 49695 mountd
100005 3 udp 40477 mountd
100005 3 tcp 49695 mountd
100024 1 udp 47894 status
100024 1 tcp 44363 status
Also, I do get the following error message when booting the server (Deepthought):
[80459.508225] NFSD: starting 90-second grace period
[82182.462399] nfsd: last server has exited, flushing export cache
[82183.847695] svc: failed to register lockdv1 RPC service (errno 97).
[82183.850993] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[82183.851061] NFSD: starting 90-second grace period
I've read through a large number of nfsroot tutorials online, but they all take radically different approaches, and I'm having a hard time sorting out what is old or outdated. Does anyone have any idea what might be causing this error?