LinuxQuestions.org
12-13-2014, 03:36 AM   #1
tuanle
How to connect to a newly installed node via PXE


Hi everybody,
This is my first Rocks installation. My small cluster consists of a frontend (IBM x3550) and two compute nodes (x3650), connected with Cat5 cable (A-B RJ45 wiring) through an unmanaged switch.
I have two questions:
1. Using the "insert-ethers" command, I installed two nodes via PXE from scratch. The installation completed successfully and the nodes rebooted. But after rebooting, DHCP could not start the Linux OS and died, leaving the message "Booting from local disk ...". I cannot connect to the nodes, and of course ssh to compute-0-X is impossible. Are my nodes actually installed or not?
After installing the nodes, I changed the boot order for the next reboot from "CD-Network-Harddisk0-..." to "CD-Harddisk0-Network-...". Now the nodes are visible and commands such as ssh and rocks sync users work.
2. When I ran mpirun (OpenMPI and other software were installed on the frontend before the compute nodes were installed), e.g. "mpirun --hostfile myhostfile -np X abc" (X = number of cores, abc = executable), I got an error because the nodes' /usr (and the OS in general) is only a bare installation, without software such as the Intel compilers and OpenMPI that I have on the frontend.

Many thanks for any explanation or suggestion.

Le Tuan,
Hanoi Univ. of Sci. and Technol.

Last edited by tuanle; 12-21-2014 at 03:03 AM.
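
The `--hostfile` argument in the mpirun command above expects a plain-text file that lists the nodes, one per line, optionally with a `slots=` count. A minimal Python sketch of generating such a file follows. The node names use the compute-0-X pattern from the post; the slot count of 4 is an assumption for illustration, not a value from the thread. A hostfile only tells mpirun where to start processes, so the executable and the MPI libraries still have to be reachable at the same path on every node.

```python
def build_hostfile(nodes, slots_per_node):
    """Return Open MPI hostfile text: one 'hostname slots=N' line per node."""
    if slots_per_node < 1:
        raise ValueError("slots_per_node must be at least 1")
    return "".join(f"{name} slots={slots_per_node}\n" for name in nodes)


# Hypothetical values: names follow the compute-0-X pattern from the post;
# 4 slots per node is an assumed core count, not taken from the thread.
hostfile_text = build_hostfile(["compute-0-0", "compute-0-1"], 4)
print(hostfile_text, end="")
```

The printed text could be saved as myhostfile and passed to mpirun as in the command quoted above.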
 
12-22-2014, 05:28 PM   #2
unSpawn
Quote:
Originally Posted by tuanle
(..) But after rebooting, DHCP could not start the Linux OS and died,
Any particular error message?


Quote:
Originally Posted by tuanle
(..) After installing the nodes, I changed the boot order for the next reboot from "CD-Network-Harddisk0-..." to "CD-Harddisk0-Network-...". Now the nodes are visible and commands such as ssh and rocks sync users work.
IMHO the "right" boot order would have been Network-Harddisk-CD while you use PXE to install the compute node, and then Harddisk-Network-CD once the OS and software are installed.


Quote:
Originally Posted by tuanle
When I ran mpirun (OpenMPI and other software were installed on the frontend before the compute nodes were installed) (..) I got an error because the nodes' /usr (and the OS in general) is only a bare installation, without software such as the Intel compilers and OpenMPI that I have on the frontend.
Install the HPC roll on the compute nodes manually and try again?
 
12-22-2014, 09:17 PM   #3
tuanle
Dear unSpawn,
Thank you for your reply.
1. Only "Booting from local disk ...". After about ten DHCP attempts, the only error message is about checking the media, cable, etc.
2. I mean that the boot order was initially "CD-Network-Harddisk0.." as recommended in the Rocks User Manual. After installing via PXE (sometimes I installed from DVD instead, with the same result), I have to change the BIOS boot order to "CD-Harddisk0-Network-..." to boot the compute nodes.
3. The same thing happens if I disconnect the compute nodes from the network. The compute nodes are installed from DVD if a DVD is in the drive.

About my second question: I was told that I must install application software into the frontend's /export/share/apps. But when I copied the tar.gz installation files to the frontend's /export/share/apps, I could see them only in the frontend's /share/apps, not in the compute nodes' /share at all (to check, I ran ssh compute-0-X; ls -l /share).
 
12-23-2014, 04:08 AM   #4
unSpawn
Quote:
Originally Posted by tuanle
1. Only "Booting from local disk ...". After about ten DHCP attempts, the only error message is about checking the media, cable, etc.
Next time please also check the frontend DHCP daemon logs?
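
One way to look at the frontend's DHCP daemon messages is to filter the system log for lines that mention dhcpd. The sketch below assumes the daemon logs through syslog into a file such as /var/log/messages, which is common on CentOS-based systems but is not confirmed in this thread; the function name and the default search string are illustrative choices. It reads the file once rather than following it continuously.

```python
def dhcp_lines(log_path, needle="dhcpd"):
    """Return the log lines that mention the DHCP daemon (case-insensitive)."""
    matches = []
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(log_path, "r", errors="replace") as log:
        for line in log:
            if needle.lower() in line.lower():
                matches.append(line.rstrip("\n"))
    return matches
```

On the frontend this might be called as `dhcp_lines("/var/log/messages")`, typically as root, right after a compute node has attempted to PXE boot, to see whether its DHCP requests arrived and were answered.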


Quote:
Originally Posted by tuanle
2. I mean that the boot order was initially "CD-Network-Harddisk0.." as recommended in the Rocks User Manual. After installing via PXE (sometimes I installed from DVD instead, with the same result), I have to change the BIOS boot order to "CD-Harddisk0-Network-..." to boot the compute nodes.
Well, if that works then at least you have got a workaround :-]


Quote:
Originally Posted by tuanle
About my second question: I was told that I must install application software into the frontend's /export/share/apps. But when I copied the tar.gz installation files to the frontend's /export/share/apps, I could see them only in the frontend's /share/apps, not in the compute nodes' /share at all (to check, I ran ssh compute-0-X; ls -l /share).
Hmm. No idea how to diagnose or fix that but you could push the HPC roll to each node individually for the time being, yes?
 
01-12-2015, 04:01 AM   #5
tuanle
Dear unSpawn,
For now I can live with changing the BIOS boot order to "Harddisk 0" first after installing the x3650 nodes via PXE. I have only a few nodes, so it is not difficult at all. Anyway, please tell me: how can I follow the frontend DHCP log?
As for my last question, I realized that I have to give a path that the compute nodes can find. So instead of "/export/apps/Application_paths..." I used "/share/apps/Application_path...". The mistake came from my inexperience with Rocks.
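
The path change described here amounts to swapping the frontend's export prefix for the prefix under which the compute nodes see the same files. A small Python sketch of that mapping follows, using the two prefixes named in the post as defaults. The function name and the example application path are hypothetical.

```python
def node_visible_path(frontend_path,
                      export_prefix="/export/apps",
                      mount_prefix="/share/apps"):
    """Map a path under the frontend's export directory to the node-side path."""
    under_export = (frontend_path == export_prefix
                    or frontend_path.startswith(export_prefix + "/"))
    if under_export:
        return mount_prefix + frontend_path[len(export_prefix):]
    # Paths outside the export directory are returned unchanged.
    return frontend_path


# Hypothetical application path, for illustration only.
print(node_visible_path("/export/apps/myapp/bin/abc"))
```

Using the /share/apps form in job scripts and in the mpirun command line means the same path resolves on the frontend and on the compute nodes.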
But another difficulty has appeared. With only a 10/100 Mbps switch interconnecting the cluster members' "eth0" interfaces, a calculation takes 2 to 3 times longer than on a single standalone server (the same hardware as the compute nodes)! So I cannot use more CPU cores than one cluster member has and still get a "normal" speed. Would using a second, Gigabit switch to connect the compute nodes' "eth1" interfaces, as in an InfiniBand scheme, improve the situation significantly? And if so, how do I connect the frontend to it too?
Thanks for any suggestion.
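
To get a rough feel for the difference between the two switch speeds, the ideal time to move a given amount of data over each link can be computed directly. The sketch below is only a best-case lower bound: it ignores latency, protocol overhead and contention, and the 100 MB payload is a hypothetical figure, not a measurement from this cluster. Whether a Gigabit interconnect helps in practice depends on how much of the run time the application spends communicating.

```python
def transfer_seconds(payload_bytes, link_mbps):
    """Ideal time in seconds to send payload_bytes over a link of link_mbps megabits/s."""
    bits = payload_bytes * 8
    return bits / (link_mbps * 1_000_000)


payload = 100 * 1_000_000  # hypothetical: 100 MB exchanged between nodes
for mbps in (100, 1000):
    print(f"{mbps} Mb/s: {transfer_seconds(payload, mbps):.1f} s")
# prints:
# 100 Mb/s: 8.0 s
# 1000 Mb/s: 0.8 s
```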
 
  

