Programming: This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Hi. I'm interested in parallel programming (lots of parallel number crunching) with Scheme, on a Linux-based supercomputer. My thought was to start with simpler multithreaded programming to take advantage of many cores on a single node, and later look into expanding to multiple nodes.
But I'm having trouble figuring out which Scheme implementation / modules to use. I was excited about Gambit-C Scheme because of its promises of great performance and an elaborate multithreading infrastructure... but then I read in the documentation that all the threads run on one core only! Guile, I think, has some nifty functions for distributing calculations to multiple processes, but Guile is just interpreted, so I'm not sure what kind of performance all that would give.
I'm finding lots of 20+ year old research papers about possibilities, but no clear information about what is available now. I would appreciate any insight or guidance.
Quote:
I was excited about Gambit-C Scheme because of its promises of great performance and an elaborate multithreading infrastructure... but then I read in the documentation that all the threads run on one core only!
Does that mean it emulates multithreading with some sort of userland threads? Otherwise, I'm not quite sure how a cross-platform implementation would always do that. Since it compiles to C, there has to be some way to access the pthread API, even if you just write an extension that you reuse.
Multithreading is more trouble than it's worth.
You'll spend 90% of your effort working on it instead of the problem and get zero performance advantage (or less).
And then you won't be able to debug it.
And I can't see why you would use Scheme for number crunching; use C if you want performance.
@ta0kira: That's the way I understood it. Basically, it is userland threads for concurrency. As far as accessing a pthread API - I don't know - I was hoping you all might know more. I was sort of hoping for a more "prefab" solution, rather than having to dive into the inner details of the language and write my own extensions and parallel programming infrastructure. But I'm open minded.
@bigearsbilly: I'm not quite sure I follow you. If I have access to a supercomputer node with 16 cores, I don't see how I am going to take advantage of that without multithreading. All the C programmers around here are using OpenMP, which is multithreading. Maybe you can clarify what you mean...
C certainly has a proven track record, but performance is not the only thing I'm interested in. Scheme is one of the functional languages, and there are a lot of intriguing and elegant ideas integrated into it. Besides, the Lisps were one of the pioneering families of HPC back in the early days.
Bigloo has both POSIX threads and userspace ("fair") threads:
Quote:
Bigloo supports multithreaded programming. Two different libraries are available. The first one, the Fair Thread library (see Section Fair Threads), enables simple code that is easy to develop and maintain. The second one, the Posix Thread library (see Section Posix Threads), makes it easier to take advantage of the actual parallelism that is now available on stock hardware. Because it is easier to program with fthread than with pthread, we strongly recommend using the former as much as possible and leaving the latter for specially demanding applications. Both libraries are described in this chapter.
...
The Fair Threads library is ``Posix Threads'' safe, which means it is possible to use at the same time both libraries. In other words, it is possible to embed one fair scheduler into a Posix thread.
It seems like most Schemes have userspace threads. Probably because userspace threads look a lot like continuations.
Quote:
@bigearsbilly: I'm not quite sure I follow you. If I have access to a supercomputer node with 16 cores, I don't see how I am going to take advantage of that without multithreading. All the C programmers around here are using OpenMP, which is multithreading. Maybe you can clarify what you mean...
Do you have access to 16 cores? Cripes!
I guess when I have 16 cores I may try MT again, but still I am not sure what advantages 16 threads would have over 16 processes.
I just prefer simplicity myself after years of experience and trying to maintain complex code.
To be honest, work is slack and I have no friends to talk to.
Quote:
Originally Posted by bigearsbilly
Multithreading is more trouble than it's worth.
You'll spend 90% of your effort working on it instead of the problem and get zero performance advantage (or less).
And then you won't be able to debug it.
Quote:
Originally Posted by bigearsbilly
Do you have access to 16 cores? Cripes!
I guess when I have 16 cores I may try MT again, but still I am not sure what advantages 16 threads would have over 16 processes.
I just prefer simplicity myself after years of experience and trying to maintain complex code.
These statements are way too broad. Mathematical programming is quite a bit different from other types of programming. When you're performing scientific computations you can't get any simpler than the computation you're performing, and in a lot of cases the fidelity of your computation is a direct result of how much you can compute in a given timeframe. Some things can be split into separate processes (e.g. running the same simulation 1 million times), but others shouldn't be (e.g. multiplying two large matrices). It really depends on the nature of the computation and what information needs to be shared among parallel components.
I do agree that if you don't properly account for I/O in your threading decisions, you can end up with substantially worse performance when trying to parallelize. In general, though, wise decisions when parallelizing can give you impressive increases in performance, even if those increases aren't linear with respect to the number of threads/processes.