LinuxQuestions.org


Roy.Geer 02-23-2015 11:51 AM

Is setting up a linux Cluster difficult?
 
Hello,

I want to set up two desktops, each with 4 cores, and combine them into an 8-core system using Linux clustering. Is this how clustering works, and how difficult is it?

Also, any good how-to sites on the subject would be appreciated. Thanks

suicidaleggroll 02-23-2015 12:00 PM

It depends on what you're trying to do. If you just want one heavy job to run as 8 parallel processes spread across both machines, you should look into MPI. If you want to be able to run many simultaneous processes and have them automatically farmed out to the two machines based on load, memory usage, etc., then look into the TORQUE resource manager.
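To make the "farmed out based on load" idea concrete, here is a toy sketch of handing each job to whichever machine currently has the least work queued. This is only an illustration of the general idea, not how TORQUE actually schedules; the node names, job names, and cost numbers are all made up.

```python
import heapq

def assign_jobs(job_costs, nodes):
    """Give each job to whichever node currently has the least total work.

    job_costs: dict of job name -> estimated cost (hypothetical units)
    nodes: list of node names (hypothetical)
    """
    # Min-heap of (current load, node name); every node starts idle.
    heap = [(0.0, name) for name in nodes]
    heapq.heapify(heap)
    placement = {name: [] for name in nodes}
    for job, cost in job_costs.items():
        load, name = heapq.heappop(heap)   # least-loaded node right now
        placement[name].append(job)
        heapq.heappush(heap, (load + cost, name))
    return placement

jobs = {"encode_a": 30, "encode_b": 10, "encode_c": 10, "encode_d": 20}
print(assign_jobs(jobs, ["desktop1", "desktop2"]))
```

A real resource manager also looks at memory, core counts, and queue policies; this sketch only balances one number per job.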

If you want your general-purpose computer usage (opening spreadsheets, watching YouTube, etc.) to use the processing power of both machines as if they were one bigger machine, it's not going to happen, at least not in a way that would actually speed things up. That's mostly because the CPU is rarely the bottleneck for those kinds of applications, and the latency of communicating between the two systems will slow things way down.
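A rough back-of-the-envelope model of that latency point, as a sketch only. The function name and every number here (CPU seconds, message counts, 0.2 ms per round trip) are assumptions picked for illustration, not measurements.

```python
def split_time(work_s, n_machines, messages, latency_s):
    """Estimated wall time when work_s seconds of CPU work is divided evenly
    across n_machines, paying latency_s for each of `messages` network
    round trips. A deliberately crude model."""
    return work_s / n_machines + messages * latency_s

# Small interactive task: little CPU work, lots of chatter between machines.
small_local = split_time(0.2, 1, 0, 0.0)
small_split = split_time(0.2, 2, 2000, 0.0002)

# Long batch job: an hour of CPU work with the same amount of chatter.
big_local = split_time(3600.0, 1, 0, 0.0)
big_split = split_time(3600.0, 2, 2000, 0.0002)

print("small task:", small_local, "vs split:", small_split)
print("big job:   ", big_local, "vs split:", big_split)
```

Under these assumed numbers the small task gets slower when split, while the long job gets close to a 2x speedup, which is why clustering pays off for batch work and not for desktop use.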

Roy.Geer 02-23-2015 12:10 PM

I would like to do video encoding; I guess this would fall under MPI clustering.

suicidaleggroll 02-23-2015 12:19 PM

Only if you build your own video encoder and can program in all of the MPI hooks, or you use one with distributed encoding already built in, e.g. x264farm, RipBot264, MediaEncodingCluster, etc. Note that I have no experience with any of these; it's just what I found with a quick Google search.
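For a sense of what "distributed encoding" has to do at minimum, here is a sketch that divides a video's frames into one contiguous range per machine, sized by core count. This is a generic illustration, not how any of the tools named above work internally; the function name and the frame and core counts are hypothetical.

```python
def split_frames(total_frames, cores_per_node):
    """Divide frames 0..total_frames-1 into contiguous (start, end) ranges,
    one per node, sized in proportion to each node's core count.
    `end` is exclusive."""
    total_cores = sum(cores_per_node)
    ranges = []
    start = 0
    cores_seen = 0
    for cores in cores_per_node:
        cores_seen += cores
        # Cumulative integer split so the ranges always cover every frame.
        end = total_frames * cores_seen // total_cores
        ranges.append((start, end))
        start = end
    return ranges

# Two 4-core desktops sharing a 1000-frame clip
print(split_frames(1000, [4, 4]))  # [(0, 500), (500, 1000)]
```

A real encoder would also have to cut at sensible points in the stream and stitch the encoded pieces back together, which is the hard part this sketch skips.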

Roy.Geer 02-23-2015 12:31 PM

Quote:

Originally Posted by suicidaleggroll (Post 5322103)
Only if you build your own video encoder and can program in all of the MPI hooks, or you use one with distributed encoding already built in, e.g. x264farm, RipBot264, MediaEncodingCluster, etc. Note that I have no experience with any of these; it's just what I found with a quick Google search.

Same here. It's almost not worth doing a cluster then. I had the assumption that any application would utilize all resources from a cluster.

I thank you suicidaleggroll for the useful information about the types of clustering and other related info.

JeremyBoden 02-23-2015 12:37 PM

I read somewhere that video encoding software usually isn't written to use more than 4 cores.
From experience, it will definitely use at least 4 cores.

Roy.Geer 02-23-2015 12:46 PM

Quote:

Originally Posted by JeremyBoden (Post 5322111)
I read somewhere that video encoding software usually isn't written to use more than 4 cores.
From experience, it will definitely use at least 4 cores.

Perhaps, BUT...

Movies are rendered on clusters with thousands of cores, but as suicidaleggroll said earlier, the studios have probably written their own custom software to take advantage of all those cores.

btmiller 02-24-2015 07:49 AM

Movie rendering is a type of problem that is called "embarrassingly parallel" because each frame is independent of every other. Therefore, if you have 5,000 quad-core computers, you can use them to render 5,000 frames simultaneously. Usually these work in a master-worker paradigm. One computer is the master, which distributes work to the other computers. When a computer finishes its frame, it goes back and asks the master for more work. I think DrQueue is popular as software to run on the master for workload distribution.
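The master-worker pattern described above can be sketched in a few lines. This is a toy, assuming threads on one machine stand in for separate computers and `render_frame` is a placeholder for real rendering; it is not how DrQueue is implemented.

```python
import queue
import threading

def render_frame(frame_no):
    # Placeholder for real rendering work (hypothetical).
    return f"frame-{frame_no:04d}"

def worker(name, tasks, results):
    # Keep asking the master's queue for another frame until none are left.
    while True:
        try:
            frame_no = tasks.get_nowait()
        except queue.Empty:
            return
        results.put((frame_no, name, render_frame(frame_no)))

def run_master(n_frames, n_workers):
    tasks = queue.Queue()
    results = queue.Queue()
    for frame_no in range(n_frames):
        tasks.put(frame_no)
    threads = [
        threading.Thread(target=worker, args=(f"node{i}", tasks, results))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Collect finished frames; every frame is independent of the others.
    rendered = {}
    while not results.empty():
        frame_no, _name, image = results.get()
        rendered[frame_no] = image
    return rendered

frames = run_master(20, 4)
print(len(frames), "frames rendered")
```

Because a fast worker simply comes back for more frames sooner, the load balances itself without the master needing to know how fast each machine is.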

Most large-scale high-performance computing clusters, by contrast, are designed to run a tightly coupled parallel application written using MPI or a similar paradigm like PGAS. In other words, the calculations being performed on one computer are tightly coupled to those performed on other computers, unlike the movie rendering case. At my work, we have a large-ish cluster that is used to run molecular simulations, for example. If you look at top500.org - the list of the top 500 reported most powerful computer systems - you'll see that they're all clusters. Usually they have a specialized parallel interconnect (e.g. InfiniBand) that allows high-bandwidth, low-latency message passing. For workload management, these tend to use TORQUE, SLURM, or one of the Grid Engine derivatives.
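To show what "tightly coupled" means in practice, here is a small sketch in plain Python (no MPI library) where a smoothing calculation is split across two pretend nodes. Each node must receive its neighbour's edge value before every step, which is the kind of exchange a real MPI program would do over the interconnect. The function names and the data are made up for illustration.

```python
def smooth(values):
    """One step of a 3-point average; the two end points are kept fixed."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return out

def smooth_two_nodes(values, steps):
    """Same calculation with the data split in half across two 'nodes'."""
    mid = len(values) // 2
    left, right = list(values[:mid]), list(values[mid:])
    for _ in range(steps):
        # The "message passing": each node needs its neighbour's edge value
        # before it can compute this step. Neither can run ahead alone.
        halo_for_left = right[0]
        halo_for_right = left[-1]
        new_left = smooth(left + [halo_for_left])[:-1]
        new_right = smooth([halo_for_right] + right)[1:]
        left, right = new_left, new_right
    return left + right

data = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
single = data
for _ in range(5):
    single = smooth(single)
print(smooth_two_nodes(data, 5) == single)
```

Contrast this with the rendering case: there a worker never needs anything from another worker, while here every step stalls until the exchange completes, so interconnect latency directly limits performance.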

