Programming: This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
I have been converting a statistical analysis utility written in C from a single-threaded design to a multithreaded one using POSIX threads (pthreads). It ran fine on 32-bit Fedora (Intel dual-core) and gave a 30% improvement over the single-threaded version. When I moved it to a new quad-core, 64-bit machine (64-bit Fedora), it fails with a segmentation fault.
Other forums pointed out issues with pthread on 64-bit machines, and recommended a fix of adding -ldl to the compile line, but this did not help. It is compiled with -Wall with no warnings. I believe the code to be thread safe, and it runs fine on a 32-bit machine.
Why not use fork? This utility spawns 480,000 threads over its execution, and all threads read (never write) the same shared 1 GB dataset (initialized via file before entering the multithreaded tasks). pthread has the advantage of not making a copy of the heap space, so each thread can access the data.
Lastly, the fault occurs before any pthreads have been created, and all output (stdout and stderr) is fflush'd to find the exact spot where the fault occurs. It fails at the first declaration of an integer array of around 250,000 elements. If I drop the array to 200,000 in a sample function, it does not fault; at 300,000 it always fails. In this snippet:
bool testFunc(int numberThreads, GlobalParams *globals)
{
    /* large local int array of ~250,000 elements declared here */
    printf("In testFunc A\n"); fflush(stdout); fflush(stderr);
The printf produces no output, and it faults. If I change the array size to 200000, printf works and there is no fault.
By the way, the original single-threaded utility ran 17 days for a single run. I am hoping this i7 920 can do the job in a few days with 8 threads, but I am stuck.
You shouldn't put that much data on the stack; I assume you're getting a stack overflow. (On 64-bit Linux, int itself typically stays 4 bytes, but long and pointer types double to 8, and stack limits can differ between installs, so the same code can blow the stack on one machine and not another.) Try making the array static or using malloc. You also need to run a backtrace (e.g. in gdb) to find out the exact line it happens on (it might be somewhere else).
1. I totally agree with ta0kira: stack overflow could easily be the root cause.
2. For whatever it's worth, I *strongly* recommend "fprintf (stderr, ...)" over "printf (); fflush (stdout)". In my experience, the latter simply doesn't work most of the time... (or, more precisely, "doesn't work the way you expect it when you most need it to"!)
Thanks, but I almost put a note saying "please don't tell me not to do this... I have to."
I have got away from doing programming lately, so maybe I have forgotten what the stack is.
I have to spawn 480,000 threads, and I did not make it clear that there is only one thread active per CPU. So there are never more than 8 additional threads, or 9 counting the supervisor. Of course, I would never spawn 480,000 threads at once, but I failed to state that. Does that change the stack recommendation, or am I missing the point? The idea is to work the long list 8 at a time until complete, one thread per CPU, roughly. I did not use CPU affinity, BTW.
Also, I just added ifdef and macros so I can switch off multitasking while using the same functions for the work, and it runs fine that way. I have pthread_create reference a wrapper for the function that does the work, so I can call it either via pthread_create or directly (switchable by macro). The structures are complex, so I wanted to validate all the pointers and malloc stuff before adding threads.
I will change all the printf. It is really better code to do it that way, and I was being sloppy. It allows far better output control, too. Thanks for the fast response!
This is pretty much one-time use to create a datafile needed for another program, but I can convert the large arrays to pointers. I don't declare anything large inside the threads, just for managing them. I can start changing to pure pointers and malloc, and post back with whether it fixes the problem. Still not sure why it worked so well on the 32-bit OS. I could just give up and let it run on the 32, but waiting 17 days per iteration really slows down our progress.
(sorry for the confusion because I refreshed and created a dup question)
Paulsm4: I moved the large array declarations to global per your suggestion, and that fixed it, at least for my limited test runs. Now I need to inspect the rest of the code for places that may cause a similar problem later. Because of the design, it was much easier to move the two large arrays to global than to change to pointers/malloc. Getting sloppy in my old age.
Paulsm4 reply to the dup thread copied here for reference:
> One of the differences between Linux and other *nix is that Linux processes *are* Linux threads.
> But that's academic.
> The reason you're crashing has little/nothing to do with threads, and everything to do with the fact that you've allocated 250,000-element arrays as local variables (off the stack).
> As I recommended in my other post: I believe that's unwise. I would strongly urge you to allocate large objects from the heap, not the stack.
> IMHO .. PSM
That should have been a small 's'; I was not suggesting you increase the stack size, only that you query it.
Sorry, that's not what I meant. I wasn't quite sure where stack expansion would take place if it neared capacity. I guess I should read up on it; it isn't something I ever really contemplated before.