Seeking recommendations for thread-safe garbage collection solution for C
Hello! I'm an experienced software engineer, with a background mostly in C and Perl.
I've never used a garbage collection solution for C, and am starting a project where I think I would like to try using one (not for C++, just plain old C).
I'd need it to be thread-safe (I expect to use pthreads), and prefer it to not depend on special "smart pointer" types. I'd also like it to be able to handle complex types (e.g. if a struct contains pointers, then when the struct gets collected so do any of its unreferenced substructures), and convenient to use.
Performance is not a priority; when I need a critical code path to be lean and fast, I'll just use lexical-scope variables or malloc/free in the critical path, and keep the GC outside of it.
My target and dev environments are Linux, so Windows-specific solutions won't be useful.
I've looked at TinyGC and BoehmGC, and they're not really what I'm looking for. If I can't find a really good fit, I may just use perl for the non-critical path and Inline::C for the critical path(s). Or maybe write my own GC library.
You've misunderstood something: C is not about garbage collection. On the other hand, you might want to use separate heaps, each of which can be freed in one call. (I think obstack is an example of that.)
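To make the "separate heap freed in one call" idea concrete, here is a minimal sketch using the obstack facility from glibc. It assumes a glibc system (`<obstack.h>` is not part of standard C); `struct node` and `build_and_discard()` are made-up names for illustration only. An obstack is not locked internally, so for a pthreads program this assumes each obstack is used by one thread at a time.

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* obstack requires these two macros to know how to get and release chunks. */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical record type, for illustration only. */
struct node {
    struct node *next;
    char *label;
};

/* Build a list of n nodes inside one obstack, walk it, then release the
   whole heap with a single obstack_free() call. Returns the node count. */
int build_and_discard(int n)
{
    struct obstack heap;
    struct node *head = NULL;
    int count = 0;

    obstack_init(&heap);

    for (int i = 0; i < n; i++) {
        struct node *nd = obstack_alloc(&heap, sizeof *nd);
        /* obstack_copy0 copies 4 bytes and appends a terminating NUL. */
        nd->label = obstack_copy0(&heap, "item", 4);
        nd->next = head;
        head = nd;
    }

    for (struct node *p = head; p != NULL; p = p->next) {
        assert(strcmp(p->label, "item") == 0);
        count++;
    }

    /* A NULL object frees everything allocated in the obstack. The obstack
       is uninitialized afterwards and needs obstack_init() before reuse. */
    obstack_free(&heap, NULL);
    return count;
}
```

Calling `build_and_discard(1000)` allocates 1000 nodes and 1000 strings and never frees any of them individually. Nested structures come along for free, as long as everything they point to lives in the same obstack.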
At first glance, there seemed to be a lot to dislike about BoehmGC, but after thinking about it a while and reading the documents SoftSprocket linked, I'm thinking of giving it a shot and seeing how well it works in practice.
I didn't like that it periodically scanned all of memory for pointer'ish-looking bytes, especially since my RSS can get very large, but perhaps I can convince it to only scan a fraction of memory via disciplined use of GC_malloc_atomic() and GC_malloc_ignore_off_page(). I'm not finding a function like GC_malloc_atomic() which could be used to mark an existing object (allocated by a third party library, for instance), but maybe I can avoid the need.
This also matters a lot less now that I've seen I can call GC_gcollect() and GC_disable() immediately before entering the critical path, and GC_enable() upon leaving it. Outside of my critical path I don't care much about performance (in fact my usual modus operandi is to use Perl to implement non-performance-intensive logic, and use Inline::C to implement the critical path).
I also didn't like that I'd have to set pointers to NULL to signal to the collector that an object could be collected, but the more I think about this, the less of an issue it seems. Especially if the collector is smart enough to know that any pointers in stack frames below the one pointed to by the stack pointer should not be considered valid (is it? I will test this and see).
I'm still wrapping my head around its thread-specific API bits.
Lots of long-running processes use the "hara-kiri" technique. After a server processes a few thousand or hundred-thousand requests, it terminates itself and is promptly re-spawned by its parent. Apache provides this service in several different ways, e.g. for FastCGI work. It's a little draconian, maybe, but very practical.
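A rough sketch of the hara-kiri pattern with plain fork()/waitpid() might look like the following. This is not how Apache implements it; `handle_request()`, `worker()` and `supervise()` are hypothetical names, the "request" is a stand-in that deliberately leaks, and the parent loop is bounded by `generations` only so that the example terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for real work. It leaks on purpose: the point of the
   technique is that the leak dies with the worker process. */
static void handle_request(int i)
{
    void *leak = malloc(64);
    (void)leak;
    (void)i;
}

/* Worker: serve a fixed number of requests, then exit so the kernel
   reclaims every byte the process ever allocated. */
static void worker(int max_requests)
{
    for (int i = 0; i < max_requests; i++)
        handle_request(i);
    _exit(0);
}

/* Parent: spawn a worker, wait for it to finish, spawn the next one.
   Returns the number of workers that exited cleanly, or -1 on error. */
int supervise(int generations, int max_requests)
{
    int clean = 0;

    assert(generations >= 0 && max_requests >= 0);

    for (int g = 0; g < generations; g++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            worker(max_requests);          /* does not return */

        int status;
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR)            /* retry only if interrupted */
                return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            clean++;
    }
    return clean;
}
```

A real server would loop forever in the parent and hand each worker a listening socket. The worker needs no free() calls at all, since process exit releases everything.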
Another useful technique, which I first learned in my mainframe days, is "subpool-based memory allocation." This is usually implemented by a software layer that sits on top of malloc() and its brethren. The idea is that, when starting a new request, the application requests a "subpool." Then, explicitly or implicitly, it references that "subpool number" in reference to all requests. Furthermore, it only allocates memory ... there usually is no equivalent of "free()." (If memory blocks are no longer needed and are to be recycled, the application must maintain some kind of free-list for this purpose ... or, simply, just ignore the block and let it float away.) At the end of the request, the application frees the subpool, which causes all of the memory that had been allocated under its auspices to be released at that time.
Subpooling won't help you with things like "memory scribbles" and stack-corruption, but it is a handy way to avoid memory leaks in long-running applications.
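The subpool layer described above can be sketched in a few dozen lines on top of malloc(). This is only one possible shape: the names `struct subpool`, `subpool_init()`, `subpool_alloc()` and `subpool_release()` are invented for the example, a pointer to the pool stands in for the "subpool number", and there is no locking, so it assumes one subpool per request or per thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every block so the subpool can find the block
   again at release time. The max_align_t member pads the header so the
   payload after it stays suitably aligned for any object type. */
union header {
    union header *next;
    max_align_t align;
};

struct subpool {
    union header *blocks;   /* linked list of every block allocated so far */
    size_t count;           /* number of live blocks */
};

void subpool_init(struct subpool *sp)
{
    assert(sp != NULL);
    sp->blocks = NULL;
    sp->count = 0;
}

/* Allocate size bytes under the given subpool. There is no per-block
   free: memory is only returned when the whole subpool is released. */
void *subpool_alloc(struct subpool *sp, size_t size)
{
    assert(sp != NULL);
    if (size > SIZE_MAX - sizeof(union header))
        return NULL;                        /* size would overflow */

    union header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;

    h->next = sp->blocks;
    sp->blocks = h;
    sp->count++;
    return h + 1;                           /* payload follows the header */
}

/* Release everything allocated under the subpool in one call.
   Returns the number of blocks freed. */
size_t subpool_release(struct subpool *sp)
{
    size_t freed = 0;

    assert(sp != NULL);
    for (union header *h = sp->blocks; h != NULL; ) {
        union header *next = h->next;
        free(h);
        h = next;
        freed++;
    }
    sp->blocks = NULL;
    sp->count = 0;
    return freed;
}
```

Typical use would be `subpool_init()` at the start of a request, `subpool_alloc()` for every allocation made on its behalf, and a single `subpool_release()` at the end. One malloc() per block keeps the sketch short; a production version would more likely carve blocks out of larger chunks.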
Well, what I would do is design libraries for the data structures you need and the operations on them, and test them thoroughly.
I wouldn't ponce around doing something not directly related to the project, especially something as difficult as GC. You could find yourself running out of time.
And if the boss asks how you are getting on, he may not take kindly to "I am implementing a garbage collector; I hope to start the project soon."
Unless you are in a luxury job with unlimited resources.