ROCK This forum is for the discussion of ROCK Linux. |
12-13-2014, 03:36 AM
|
#1
|
LQ Newbie
Registered: Dec 2014
Posts: 4
Rep:
|
How to connect to a newly installed node via PXE
Hi everybody,
This is my first Rocks installation. My small planned cluster consists of a frontend (IBM x3550) and two compute nodes (x3650), connected with Cat5 cables (A-B RJ45 wiring) through an unmanaged switch.
I have two questions:
1. With the command "insert-ethers" I installed two nodes via PXE from scratch. They completed the installation successfully and rebooted. But after rebooting, the network (DHCP) boot could not start the Linux OS and died, leaving the message "Booting from local disk ...". I could not connect to the nodes, and of course ssh to compute-0-X was impossible. Are my nodes actually installed or not?
After installing the nodes, I changed the boot order for the next reboot from "CD-Network-Harddisk0-..." to "CD-Harddisk0-Network-...". Now the nodes are visible and I can run commands such as ssh, rocks sync users ...
2. When I issued mpirun (OpenMPI and other software were installed on the frontend before installing the compute nodes) like "mpirun --hostfile myhostfile -np X abc" (X = # of cores, abc = executable), I got an error showing that the nodes' /usr (and the OS in general) is only a bare installation, without any of the software, such as the Intel compilers and OpenMPI, that I have on the frontend.
Many thanks for any explanation and suggestions.
Le Tuan,
Hanoi Univ. of Sci. and Technol.
Last edited by tuanle; 12-21-2014 at 03:03 AM.
|
|
|
12-22-2014, 05:28 PM
|
#2
|
Moderator
Registered: May 2001
Posts: 29,415
|
Quote:
Originally Posted by tuanle
(..) But after rebooting, the network (DHCP) boot could not start the Linux OS and died,
|
Any particular error message?
Quote:
Originally Posted by tuanle
(..) After installing the nodes, I changed the boot order for the next reboot from "CD-Network-Harddisk0-..." to "CD-Harddisk0-Network-...". Now the nodes are visible and I can run commands such as ssh, rocks sync users ...
|
IMHO the "right" boot order would have been Network-Harddisk-CD while you use PXE to install the compute node, and then Harddisk-Network-CD once the OS and software are installed.
Quote:
Originally Posted by tuanle
When I issued mpirun (OpenMPI and other software were installed on the frontend before installing the compute nodes) (..) I got an error showing that the nodes' /usr (and the OS in general) is only a bare installation, without any of the software, such as the Intel compilers and OpenMPI, that I have on the frontend.
|
Install the HPC roll on the compute nodes manually and try again?
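Before reinstalling anything, it may help to confirm which nodes actually lack mpirun. A minimal sketch, assuming passwordless ssh from the frontend to the compute nodes; the function name check_mpirun_on_nodes is hypothetical and the node names only follow the compute-0-X pattern used in this thread:

```shell
# Hypothetical helper: report whether mpirun is on the PATH of each node.
# Assumes passwordless ssh from the frontend to every node listed.
check_mpirun_on_nodes() {
    for node in "$@"; do
        if ssh "$node" 'command -v mpirun' >/dev/null 2>&1; then
            echo "$node: mpirun found"
        else
            echo "$node: mpirun NOT found"
        fi
    done
}

# Example, run on the frontend:
#   check_mpirun_on_nodes compute-0-0 compute-0-1
```

A "NOT found" result only says that mpirun is missing from the default PATH of a non-interactive shell on that node; the software could still be installed somewhere else.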
|
|
|
12-22-2014, 09:17 PM
|
#3
|
LQ Newbie
Registered: Dec 2014
Posts: 4
Original Poster
Rep:
|
Dear unSpawn
Thank you for your reply.
1. Only "Booting from local disk ...". After about ten DHCP attempts, the only error message is about checking the media, cable, etc.
2. I mean that the order was initially "CD-Network-Harddisk0.." as given in the Rocks User Manual, and after installing via PXE (sometimes I installed from DVD instead, but the result was the same) I have to change the BIOS boot order to "CD-Harddisk0-Network-..." for the compute nodes to start.
3. The same thing happens if I disconnect the compute nodes from the network. The compute nodes install from the DVD if one is in the DVD drive.
About my 2nd question, I was told that I must install application software into the frontend's /export/share/apps. But when I copied the tar.gz (installation) files to the frontend's /export/share/apps, I could see them only in the frontend's /share/apps, and not in the compute nodes' /share at all (to check, I ran ssh compute-0-X; ls -l /share)
|
|
|
12-23-2014, 04:08 AM
|
#4
|
Moderator
Registered: May 2001
Posts: 29,415
|
Quote:
Originally Posted by tuanle
1. Only "Booting from local disk ...". After about ten DHCP attempts, the only error message is about checking the media, cable, etc.
|
Next time please also check the frontend DHCP daemon logs?
Quote:
Originally Posted by tuanle
2. I mean that the order was initially "CD-Network-Harddisk0.." as given in the Rocks User Manual, and after installing via PXE (sometimes I installed from DVD instead, but the result was the same) I have to change the BIOS boot order to "CD-Harddisk0-Network-..." for the compute nodes to start.
|
Well, if that works then at least you have got a workaround :-]
Quote:
Originally Posted by tuanle
About my 2nd question, I was told that I must install application software into the frontend's /export/share/apps. But when I copied the tar.gz (installation) files to the frontend's /export/share/apps, I could see them only in the frontend's /share/apps, and not in the compute nodes' /share at all (to check, I ran ssh compute-0-X; ls -l /share)
|
Hmm. No idea how to diagnose or fix that but you could push the HPC roll to each node individually for the time being, yes?
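One possible check, sketched under the assumption that /share/apps on the compute nodes is an on-demand autofs mount (as in a default Rocks setup): in that case a plain "ls -l /share" can look empty until the full path is accessed. The helper name check_share_apps is hypothetical.

```shell
# Hypothetical helper: list /share/apps on one compute node.
# Accessing the full path (not just /share) is what would trigger an
# on-demand autofs mount, assuming the node uses one for /share/apps.
check_share_apps() {
    ssh "$1" 'ls -l /share/apps/'
}

# Example, run on the frontend (node name follows the compute-0-X pattern):
#   check_share_apps compute-0-0
```

If the files show up this way, the export itself is probably fine and only the way it was inspected was misleading; if they do not, the mount is worth looking at more closely.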
|
|
|
01-12-2015, 04:01 AM
|
#5
|
LQ Newbie
Registered: Dec 2014
Posts: 4
Original Poster
Rep:
|
Dear unSpawn,
For now I will live with changing the BIOS boot order to "Harddisk 0" first after installing the x3650 nodes via PXE. I have only a few nodes, so it is not difficult at all. Anyway, please tell me: how can I follow the frontend DHCP log?
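One way the frontend DHCP log might be followed, as a minimal sketch: it assumes dhcpd on the frontend logs through syslog to /var/log/messages (the usual default on the CentOS base that Rocks uses). The path and the function name dhcp_log_lines are assumptions, not something confirmed in this thread.

```shell
# Assumption: dhcpd logs via syslog to /var/log/messages on the frontend.
LOGFILE="${LOGFILE:-/var/log/messages}"

# Print the most recent DHCP-related log lines (default: the last 20).
dhcp_log_lines() {
    grep -i 'dhcp' "$LOGFILE" | tail -n "${1:-20}"
}

# To watch live while a compute node boots, run as root on the frontend:
#   tail -f /var/log/messages | grep --line-buffered -i dhcp
```

Repeated DHCPDISCOVER lines from a node's MAC address with no matching DHCPOFFER would suggest that the frontend is not answering that node; no lines at all would point at the cabling or the switch instead.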
As for the last question I posted, I realized that I have to give a path that the compute nodes can find. So, instead of "/export/apps/Application_paths..." I used "/share/apps/Application_path...". It was due to my inexperience with Rocks.
But another difficulty has appeared: with only a 10/100 Mbps switch interconnecting the cluster members' "eth0", a calculation takes 2-3 times longer than on a single standalone server (the same hardware as the compute nodes)! So I cannot use more CPU cores than those of one cluster member and still get a "normal" speed. Would using a second, Gigabit switch to connect the compute nodes' "eth1", as in an InfiniBand scheme, improve the situation significantly? And if it does, how do I connect the frontend to it too?
Thanks for any suggestions.
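For a rough sense of scale, the raw link speeds can be compared with simple arithmetic. This is only a sketch: it ignores latency and protocol overhead, treats 1 MB as 10^6 bytes, and the function name xfer_seconds and the 100 MB message size are made up for illustration.

```shell
# Ideal time in seconds to move a payload over a link, ignoring latency
# and protocol overhead.
#   $1 = payload in megabytes (10^6 bytes), $2 = link speed in Mbit/s
xfer_seconds() {
    awk -v mb="$1" -v mbps="$2" 'BEGIN { printf "%.1f\n", mb * 8 / mbps }'
}

xfer_seconds 100 100    # Fast Ethernet: prints 8.0
xfer_seconds 100 1000   # Gigabit:       prints 0.8
```

So the raw bandwidth is ten times higher on Gigabit, but how much of that an MPI job gains depends on its communication pattern. If a second network on eth1 were added, Open MPI's TCP transport can be restricted to it with "--mca btl_tcp_if_include eth1"; whether that helps on this cluster is untested here.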
|
|
|
All times are GMT -5.