LinuxQuestions.org

Azazwa 03-31-2009 12:59 AM

DHCP, OSCAR cluster installation, VMware server and PXE-boot problem.
 
Hi!

I seem to have a talent for doing things at the wrong time, in the wrong order or just in general messing things up. Sigh...

I am installing a little cluster with OSCAR 5.1rc (which I have to do for a postgrad chemistry project. Yes, the connection between clusters and chemistry isn't immediately apparent, but it's there...)
At the second-to-last step, just before the cluster should have been complete, it failed: Torque and Maui could not finish their configuration. I have a server node, and had added one normal compute node.

Then, perhaps because my supervisor wanted to see some progress, and because I had a few hours in which I couldn't continue with other work, I installed VMware Server 2.0 on the server node, since we have a small partition with XP on it (Fedora 8 runs on the larger partition).

Now, when I try to add more nodes to the cluster, they fail to boot over the network. This is what I get on the nodes:

PXE-E11 : ARP timeout
PXE-E11 : ARP timeout
PXE-E38 : TFTP cannot open connection
PXE-M0F : Exiting Intel Boot Agent

So I guess the troublemaker is VMware and its DHCP server. The VMware installer said the following:

Code:

This system appears to have a DHCP server configured for normal use.  Beware
that you should teach it how not to interfere with VMware Server's DHCP server.
There are two ways to do this:

1) Modify the file /etc/dhcpd.conf to add something like:

subnet 10.0.0.0 netmask 255.0.0.0 {
# Note: No range is given, vmnet-dhcpd will deal with this subnet.
}

2) Start your DHCP server with an explicit list of network interfaces to deal
with (leaving out vmnet1). e.g.:

dhcpd eth0

Consult the dhcpd(8) and dhcpd.conf(5) manual pages for details.

Perhaps the above makes everything crystal-clear to people who are not as clueless as I am. Firstly, I don't have a dhcpd.conf file in /etc. Now that I have loaded VMware, I have this /etc/vmware/vmnet8/dhcpd which looks like this:

Code:

#
# Configuration file for ISC 2.0b6pl1 vmnet-dhcpd operating on vmnet8.
#
# This file was automatically generated by the VMware configuration program.
# If you modify it, it will be backed up the next time you run the
# configuration program.
#
# We set domain-name-servers to make some DHCP clients happy
# (dhclient as configued in SuSE, TurboLinux, etc.).
# We also supply a domain name to make pump (Red Hat 6.x) happy.
#
allow unknown-clients;
default-lease-time 1800;                # 30 minutes
max-lease-time 7200;                        # 2 hours

subnet 192.168.0.0 netmask 255.255.0.0 {
    range 192.168.128.0 192.168.255.254;
    option broadcast-address 192.168.255.255;
    option domain-name-servers 192.168.0.2;
    option domain-name "localdomain";
    option routers 192.168.0.2;
}


and similarly for vmnet1. Is there some other dhcpd file that I should change as shown in 1) above?

Secondly, I don't really know how to do 2). Is 2) a one-off thing, or does it have to be done every time dhcpd starts?
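
On 2): it need not be a one-off if it is made part of the service configuration. A minimal sketch, assuming the Fedora/Red Hat dhcp package, whose init script reads /etc/sysconfig/dhcpd and passes DHCPDARGS to dhcpd, and assuming the cluster network sits on eth0 (that interface name is a guess; use whichever one faces the compute nodes):

```shell
# /etc/sysconfig/dhcpd
# Interfaces dhcpd should listen on; vmnet1 and vmnet8 are left out on purpose.
# eth0 is an assumption, not something confirmed in this thread.
DHCPDARGS=eth0
```

After editing the file, restarting the service as root with service dhcpd restart should make dhcpd pick up the interface list.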

Thirdly, I don't need VMware urgently until the cluster is up and running. Could I uninstall VMware, and just install it again later? Hm, but I'm afraid it might cause problems again. I would prefer to sort out the problems now, and not have some mysterious problems popping up later.

Any advice would be greatly appreciated!

Azazwa 03-31-2009 03:17 AM

Hi! I uninstalled VMware, but now I am getting this:
PXE-E32: TFTP open timeout

My /etc/dhcpd.conf file (which I managed to find in the meantime) looks like this:
Code:

####################################################################
# This dhcpd.conf file was generated by the systeminstaller command
# mkdhcpconf. It reflects the contents of the CLAMDR database.
# File generated at 9:51:26 on 3/31/2009
####################################################################

deny unknown-clients;
option subnet-mask 255.255.0.0;
option broadcast-address 192.168.255.255;
option domain-name "up.ac.za";
option routers 192.168.1.254;
ddns-update-style none; # For dhpcd version 3

# Defined cluster nodes...
subnet 192.168.0.0 netmask 255.255.0.0 {
        group {
                host normnode1{
                        hardware ethernet 00:1C:C0:AF:10:01;
                        fixed-address 192.168.1.1;
                        filename "pxelinux.0";
                        option routers 192.168.1.254;
                        option domain-name "up.ac.za";
                        next-server oscar_server;
                }
                host normnode2{
                        hardware ethernet 00:1C:C0:AF:0F:FC;
                        fixed-address 192.168.1.2;
                        filename "pxelinux.0";
                        option routers 192.168.1.254;
                        option domain-name "up.ac.za";
                        next-server oscar_server;
                }
                host normnode3{
                        hardware ethernet 00:1C:C0:AF:0F:CF;
                        fixed-address 192.168.1.3;
                        filename "pxelinux.0";
                        option routers 192.168.1.254;
                        option domain-name "up.ac.za";
                        next-server oscar_server;
                }
        }
}

# This entry ignores requests on eth1...
subnet 137.215.104.0 netmask 255.255.255.0 {
        not authoritative;
}

Would it make any difference to add

Code:

allow bootp;
allow booting;

before the "# Defined cluster nodes..." comment in the above file?

Any advice would be appreciated!

Azazwa 03-31-2009 06:50 AM

I added the two "allow" lines and it didn't make any difference.
I'm rather depressed.

Azazwa 03-31-2009 10:52 AM

Solved the problem
 
For the PXE-E32: TFTP open timeout problem, disable your firewall properly. That is, don't do it via the GUI. Run service iptables stop as root (the # in front of it was just the root prompt), and check these sites:

http://www.dbapool.com/forumthread/topic_1069.html
http://forums.fedoraforum.org/archiv...p/t-31587.html

Hope this helps anyone who hits the same error I had.
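
If switching the firewall off entirely is not an option, a narrower alternative is to leave iptables running and open only the TFTP port on the cluster-facing interface. This is only a sketch and not something tested in this thread: it assumes the compute nodes are on eth0 (the dhcpd.conf above only says that eth1 is the ignored interface), that TFTP uses its standard UDP port 69, and that the existing ruleset already accepts ESTABLISHED,RELATED traffic so the rest of the transfer gets through. pxe_fw_cmds is a made-up helper name. It only prints the commands, so they can be read before anything is run as root.

```shell
#!/bin/sh
# Hypothetical helper: print the commands that would let PXE clients
# reach the TFTP server (UDP 69) on one interface.
# The interface name is an assumption; pass your own as $1.
pxe_fw_cmds() {
    iface="${1:-eth0}"
    # Insert an ACCEPT rule at the top of the INPUT chain.
    echo "iptables -I INPUT -i $iface -p udp --dport 69 -j ACCEPT"
    # Fedora/Red Hat init script action: keep the rule across reboots.
    echo "service iptables save"
}

# Review the output first; to apply it, run as root:
#   pxe_fw_cmds eth0 | sh
pxe_fw_cmds eth0
```

Printing instead of executing is deliberate: a wrong interface name in a firewall rule is easier to spot on screen than to debug afterwards.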


All times are GMT -5.