LinuxQuestions.org

VolumetricSteve 03-08-2011 05:36 PM

Many questions about clustering software
 
My problem is very simple, but the more I look into it, the more complex the solution seems. I have one program (q3map2), which is the map compiler for Quake 3 Arena. It can run multithreaded or single-threaded; the thread count is set with a switch at the command line (-threads X). Because of the complexity of the maps I give it, the compile times are several hours even on my 6-core AMD overclocked to 4 GHz. So, since I can buy more computers just like it for 500-ish bucks each (if you want a list of parts I'd be happy to supply it), I was thinking I could make a Beowulf cluster out of a few of them.

The problem I'm having is that I can't find any clear explanations of how the different kinds of clustering work, or which of them may or may not be available to me.

* I don't know if I need to rewrite q3map2 to make it work in a cluster.

* I don't know if I only need to recompile q3map2 to make it work in a cluster.

* I don't know if there's even a cluster OS that would just transparently run q3map2 across a bunch of machines.

I really don't know where to start, and what's throwing me off the most is that a lot of the terminology is very confusing. A lot of the time I see "Single System Image" thrown around, meaning a cluster with shared resources. But as far as I can tell, what it means in practice is that the program using the most power gets moved to the most powerful system in the cluster (which would be totally useless to me); the work is not shared evenly across all systems to boost performance. Or maybe I'm misunderstanding even that. Is anyone here a cluster expert?


Thanks, sorry for the long post.

stress_junkie 03-08-2011 08:23 PM

I don't have all the answers to your questions. I can just give you a pointer. The Beowulf-type cluster was improved upon by openMosix. Unfortunately the openMosix project was abandoned in 2008.

http://openmosix.sourceforge.net/

I don't know of another cluster project focused on this kind of workload sharing. Even these projects were focused on distributing independent jobs over the available nodes, not on distributing the threads of a single job.
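
To make that distinction concrete, here is a minimal Python sketch of job-level sharing. It is an illustration only: the node names, the job names, and the round-robin policy are all made up for the example, and it is not a description of how openMosix or any real scheduler picks nodes.

```python
from itertools import cycle


def assign_jobs(jobs, nodes):
    """Assign each independent job to a node in round-robin order.

    Every job still runs start to finish on exactly one node.
    """
    placement = {node: [] for node in nodes}
    for job, node in zip(jobs, cycle(nodes)):
        placement[node].append(job)
    return placement


if __name__ == "__main__":
    # Hypothetical job and node names, for illustration only.
    jobs = ["map_a", "map_b", "map_c", "map_d", "map_e"]
    nodes = ["node1", "node2"]
    for node, assigned in assign_jobs(jobs, nodes).items():
        print(node, assigned)
```

With this kind of sharing, five maps finish sooner on two nodes than on one, but any single map takes just as long as it would on one machine.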

VolumetricSteve 03-09-2011 10:01 AM

Oh yeah, I've been all over OpenMosix. Right now, I'm making a list of all the software I could potentially use to make this happen, and once I do... I'm going to write the single best source of documentation for clustering in existence, because everything I've come across so far has been garbage. I don't know how people manage this, as convoluted as it appears at first glance. I understand that with Linux and clustering you have to expect to get your hands dirty, but I feel ill-equipped at every turn, and I consider myself at least fairly tech-savvy. At a minimum, I can dig through mountains of documentation, as long as there is documentation.

To put it in plain English, if you want a cluster of computers that all appear as one computational resource, so that one program runs across many machines, you're looking for a Single System Image cluster. More specifically, you're looking for a Beowulf cluster; however, I haven't seen this term used much by any of the software I've been looking at.

http://en.wikipedia.org/wiki/Single_system_image

Right now, I'm working with the developers of Kerrighed to figure out exactly what it does and how it works. A developer told me:

"If you can configure your application to run many processes, then Kerrighed should be able to distribute the load across the cluster.

Then two distinct mechanisms of Kerrighed, with different limitations can distribute the load:
1) Process migration enables dynamic load balancing, but it is restricted to single-threaded processes.
2) Remote fork allows to balance the load when forking new processes. Those processes can be multi-threaded."

So, considering that, it seems I could take my program (q3map2) and configure it to run in a single thread, which is easy to do, and then Kerrighed should distribute it across the cluster. However, the term "dynamic load balancing" could still mean that it just moves the process to the most available node and runs it only on that node, dynamic only in the sense that it will move the process completely to another node if the current one becomes too busy. If that's true, it's useless to me, so I've asked for more clarification.
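
For reference, this is roughly what "run many processes" looks like from the application side, as a minimal Python sketch. It is not q3map2 and not Kerrighed code: the workload (summing squares) is a placeholder, and the helper names split_work and run_workers are made up for the example. The only point is that the work gets cut into chunks and each chunk goes to its own single-threaded child process. Whether a cluster then puts those children on different nodes is exactly the open question above; on an ordinary machine they all just run locally.

```python
import subprocess
import sys


def split_work(items, n_workers):
    """Split items into n_workers contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def run_workers(chunks):
    """Start one single-threaded child process per chunk and collect results."""
    # Placeholder workload: each child sums the squares of its own chunk.
    code = "import sys; print(sum(int(x) ** 2 for x in sys.argv[1:]))"
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", code, *map(str, chunk)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for chunk in chunks
    ]
    # All children are running at this point. Assumption: this is where a
    # cluster could spread them over nodes; plain Linux keeps them local.
    return [int(p.communicate()[0]) for p in procs]


if __name__ == "__main__":
    chunks = split_work(list(range(12)), 4)
    print(sum(run_workers(chunks)))
```

Note that the splitting has to happen inside the program itself. A single process running one thread has nothing to split, so moving it to another node can't make it finish any faster.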


I'll post more findings as I make them.

