LinuxQuestions.org
04-22-2009, 02:15 PM   #1
rushmanw
LQ Newbie
Registered: Apr 2009
Posts: 6

pthread segmentation fault 64-bit only


I have been converting a statistical analysis utility written in C from a single-threaded design to a multithreaded one using POSIX threads (pthread). It ran fine on 32-bit Fedora (Intel dual core) and gave a 30% improvement over the single-threaded version. When I moved to a new quad-core, 64-bit machine (64-bit Fedora), it failed with a segmentation fault.

Other forums pointed out issues with pthread on 64-bit machines, and recommended a fix of adding -ldl to the compile line, but this did not help. It is compiled with -Wall with no warnings. I believe the code to be thread safe, and it runs fine on a 32-bit machine.

Why not use fork? This utility spawns 480,000 threads over its execution, and all threads read (never write) the same shared 1 GB dataset (initialized via file before entering the multithreaded tasks). pthread has the advantage of not making a copy of the heap space, so each thread can access the data.

Lastly, the fault occurs before any pthreads have been created, and all output (stdout and stderr) is fflush'd to find the exact spot where the fault occurs. It fails at the first declaration of an integer array of around 250,000 elements. If I drop the array to 200,000 elements in a sample function, it does not fault. At 300,000 it always fails. In this snippet:
bool testFunc(int numberThreads, GlobalParams *globals)
{
    printf("In testFunc A\n");
    fflush(stdout);
    fflush(stderr);
    int a[300000];
    return true;
}
The printf produces no output, and it faults. If I change the array size to 200000, printf works and there is no fault.

By the way, the original single-threaded utility ran 17 days for a single run. I am hoping this i7 920 can do the job in a few days with 8 threads, but I am stuck.
 
04-22-2009, 02:49 PM   #2
ta0kira
Senior Member
Registered: Sep 2004
Distribution: FreeBSD 9.1, Kubuntu 12.10
Posts: 3,078

You shouldn't put that much data on the stack. Because you're on a 64-bit machine, int is 8 bytes, twice the size of one on a 32-bit machine. I assume you're getting a stack overflow. Try making the array static or using malloc. You also need to run a backtrace (e.g. in gdb) to find the exact line where it happens (it might be somewhere else).
Kevin Barry
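
A minimal sketch of the malloc suggestion, assuming the array is only scratch space inside the function. The name test_func_heap and its count parameter are hypothetical and not taken from the original utility.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical rewrite of testFunc: the large array lives on the heap,
 * so its size no longer counts against the stack limit. */
bool test_func_heap(size_t count)
{
    assert(count > 0);
    fprintf(stderr, "In test_func_heap\n");

    int *a = malloc(count * sizeof *a);
    if (a == NULL) {
        perror("malloc");
        return false;           /* allocation failure is reported, not a crash */
    }

    a[count - 1] = 2;           /* touch the last element to show it is usable */
    bool ok = (a[count - 1] == 2);

    free(a);
    return ok;
}
```

Calling test_func_heap(300000) from main exercises the same element count that faulted as a local array. Unlike a stack overflow, a failed malloc can be detected and handled.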

Last edited by ta0kira; 04-22-2009 at 02:51 PM.
 
04-22-2009, 03:22 PM   #3
paulsm4
Guru
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Hi -

1. I totally agree with ta0kira: stack overflow could easily be the root cause.

2. For whatever it's worth, I *strongly* recommend "fprintf (stderr, ...)" over "printf (); fflush (stdout)". In my experience, the latter simply doesn't work most of the time... (or, more precisely, "doesn't work the way you expect it when you most need it to"!)

IMHO .. PSM
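
A small sketch of the stderr suggestion. The C standard specifies that stderr is not fully buffered when the program starts, so a message written there does not wait for a later fflush call. The helper name trace is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write a trace line straight to stderr.
 * Returns the number of characters written, or a negative value on error. */
int trace(const char *where)
{
    assert(where != NULL);
    return fprintf(stderr, "TRACE: reached %s\n", where);
}
```

A call such as trace("testFunc A") at the top of a function replaces the printf/fflush pair. If the fault happens before the call itself can execute, neither approach will print anything, which is why a debugger backtrace is still worth running.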
 
04-22-2009, 04:15 PM   #4
rushmanw
LQ Newbie
Registered: Apr 2009
Posts: 6
Original Poster

Thanks, but I almost put a note saying "please don't tell me not to do this... I have to."
I have gotten away from programming lately, so maybe I have forgotten what the stack is.

I have to spawn 480,000 threads, and I did not make it clear that there is only one thread active per CPU. So there are never more than 8 additional threads, or 9 counting the supervisor. Of course, I would never spawn 480,000 threads at once, but I failed to state that. Does that change the stack recommendation, or am I missing the point? The idea is to work the long list 8 at a time until complete, one thread per CPU, roughly. I did not use CPU affinity, BTW.
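
The "8 at a time until complete" scheme could look roughly like the sketch below. Task, run_task, run_in_batches, and MAX_ACTIVE are all hypothetical names, and the doubling in run_task is only a stand-in for the real analysis, which is not shown in the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_ACTIVE 8   /* assumed: one worker per logical CPU */

/* Hypothetical work item; the real task structure is not shown. */
typedef struct {
    long input;
    long result;
} Task;

static void *run_task(void *arg)
{
    Task *t = arg;
    t->result = t->input * 2;   /* stand-in for the real work */
    return NULL;
}

/* Run `count` tasks, never keeping more than MAX_ACTIVE threads alive.
 * Returns 0 on success or the pthread error code. */
int run_in_batches(Task *tasks, size_t count)
{
    pthread_t tid[MAX_ACTIVE];

    assert(tasks != NULL || count == 0);
    for (size_t base = 0; base < count; base += MAX_ACTIVE) {
        size_t started = 0;
        int err = 0;

        while (started < MAX_ACTIVE && base + started < count) {
            err = pthread_create(&tid[started], NULL, run_task,
                                 &tasks[base + started]);
            if (err != 0)
                break;
            started++;
        }
        /* Join everything that was started before the next batch. */
        for (size_t i = 0; i < started; i++)
            pthread_join(tid[i], NULL);

        if (err != 0)
            return err;
    }
    return 0;
}
```

Each batch waits for its slowest thread, so a fixed pool of 8 long-lived workers pulling from a shared queue would probably keep the CPUs busier. The batch form is shown because it matches the description more directly.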

Also, I just added ifdef and macros so I can switch off multitasking while using the same functions for the work, and it runs fine that way. I have pthread_create reference a wrapper for the function that does the work, so I can call it either via pthread_create or directly (switchable by macro). The structures are complex, so I wanted to validate all the pointers and malloc stuff before adding threads.
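
A sketch of the macro switch described above, under the assumption that a single compile-time flag selects the mode. USE_THREADS, WorkArgs, do_work, work_wrapper, and dispatch are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical argument block; the real structures are more complex. */
typedef struct {
    int id;
    int done;
} WorkArgs;

/* The function that does the actual work, callable directly. */
static void do_work(WorkArgs *w)
{
    w->done = 1;
}

/* Wrapper with the signature pthread_create expects. */
static void *work_wrapper(void *arg)
{
    do_work(arg);
    return NULL;
}

/* Build with -DUSE_THREADS to go through pthreads. Without it, the same
 * wrapper is called directly so pointers can be validated single-threaded. */
int dispatch(WorkArgs *w)
{
    assert(w != NULL);
#ifdef USE_THREADS
    pthread_t tid;
    int err = pthread_create(&tid, NULL, work_wrapper, w);
    if (err != 0)
        return err;
    return pthread_join(tid, NULL);
#else
    work_wrapper(w);
    return 0;
#endif
}
```

Because both paths go through work_wrapper, a run that passes without USE_THREADS but fails with it points at threading rather than at the work function.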

I will change all the printf. It is really better code to do it that way, and I was being sloppy. It allows far better output control, too. Thanks for the fast response!
 
04-22-2009, 04:16 PM   #5
dmail
Member
Registered: Oct 2005
Posts: 970

Quote:
Originally Posted by ta0kira
You shouldn't put that much data on the stack. Because you're on a 64-bit machine, int is 8 bytes; twice the size of one on a 32-bit machine.
Doesn't Fedora use the LP64 model? I would think it does, which means int is 4 bytes.

What does ulimit -S report?
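
Both questions can also be answered from inside the program. The sketch below assumes a POSIX system. The function names soft_stack_limit and report_sizes are hypothetical. With a 4-byte int, int a[300000] needs 1,200,000 bytes of stack on top of whatever the callers already use.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Fetch the soft stack limit in bytes (the value `ulimit -s` reports in KiB).
 * Returns 0 on success, -1 on failure. RLIM_INFINITY means unlimited. */
int soft_stack_limit(rlim_t *out)
{
    struct rlimit rl;

    assert(out != NULL);
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;
    return 0;
}

/* Print type sizes and the stack limit for comparison with array sizes. */
void report_sizes(void)
{
    rlim_t lim;

    fprintf(stderr, "sizeof(int)=%zu sizeof(long)=%zu\n",
            sizeof(int), sizeof(long));
    if (soft_stack_limit(&lim) == 0) {
        if (lim == RLIM_INFINITY)
            fprintf(stderr, "stack limit: unlimited\n");
        else
            fprintf(stderr, "stack limit: %llu bytes\n",
                    (unsigned long long)lim);
    }
}
```

Calling report_sizes() at the start of main on both the 32-bit and the 64-bit machine shows whether the type sizes or the stack limit differ between them.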
 
04-22-2009, 04:21 PM   #6
rushmanw
LQ Newbie
Registered: Apr 2009
Posts: 6
Original Poster

dmail, int shows as 4, long as 8, yes. Either way, the stack would be a concern if I spawned 480,000 at once. Looking to see if only 8 threads at a time is still a concern.

Last edited by rushmanw; 04-22-2009 at 04:22 PM. Reason: left out threads
 
04-22-2009, 04:28 PM   #7
paulsm4
Guru
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Quote:
dmail, int shows as 4, long as 8, yes. Either way, the stack would be a concern if I spawned 480,000 at once. Looking to see if only 8 threads at a time is still a concern.
Declaring large objects as local variables is *always* a concern. Even if you can get away with it today ... you're still leaving a potential landmine that somebody's going to have to debug tomorrow.

My advice is "Just don't do it".

IMHO .. PSM
 
04-22-2009, 04:41 PM   #8
rushmanw
LQ Newbie
Registered: Apr 2009
Posts: 6
Original Poster

This is pretty much one-time use to create a datafile needed for another program, but I can convert the large arrays to pointers. I don't declare anything large inside the threads, just for managing them. I can start changing to pure pointers and malloc, and post back with whether it fixes the problem. Still not sure why it worked so well on the 32-bit OS. I could just give up and let it run on the 32, but waiting 17 days per iteration really slows down our progress.
 
04-22-2009, 05:03 PM   #9
rushmanw
LQ Newbie
Registered: Apr 2009
Posts: 6
Original Poster

(sorry for the confusion because I refreshed and created a dup question)
Paulsm4: I moved the large array declarations to global per your suggestion, and that fixed it, at least for my limited test runs. Now I need to inspect the rest of the code for places that may cause a similar problem later. Because of the design, it was much easier to move the two large arrays to global than to change to pointers/malloc. Getting sloppy in my old age.

THANK YOU!
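
For reference, moving the arrays to global scope might look like the sketch below. The names work_a, work_b, fill_work_arrays, and N_ELEMENTS are hypothetical. The element count is the one from the failing test.

```c
#include <assert.h>
#include <stddef.h>

#define N_ELEMENTS 300000   /* size taken from the failing test case */

/* File-scope storage is reserved when the program is loaded and is
 * zero-initialized, so it does not consume any thread's stack. */
static int work_a[N_ELEMENTS];
static int work_b[N_ELEMENTS];

/* Hypothetical stand-in for the code that used to declare the arrays
 * locally. Returns the number of elements written to each array. */
size_t fill_work_arrays(int value)
{
    for (size_t i = 0; i < N_ELEMENTS; i++) {
        work_a[i] = value;
        work_b[i] = -value;
    }
    assert(work_a[N_ELEMENTS - 1] == value);
    return N_ELEMENTS;
}
```

Global arrays are shared by every thread, so this is only safe as long as the worker threads do not write to them concurrently, which matches the statement that the arrays are used just for managing the threads.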

Paulsm4 reply to the dup thread copied here for reference:
>One of the differences between Linux and other *nix is that Linux processes *are* Linux threads:
>
>http://www.linuxjournal.com/article/3814
>
>But that's academic.
>
>The reason you're crashing has little/nothing to do with threads, and everything to do with the fact that you've allocated 250,000 element arrays as local variables (off the stack).
>
>As I recommended in my other post: I believe that's unwise. I would strongly urge you to allocate large objects from the heap, not the stack.
>
>IMHO .. PSM

Last edited by rushmanw; 04-22-2009 at 05:10 PM.
 
04-22-2009, 11:45 PM   #10
ta0kira
Senior Member
Registered: Sep 2004
Distribution: FreeBSD 9.1, Kubuntu 12.10
Posts: 3,078

Quote:
Originally Posted by rushmanw
dmail, int shows as 4, long as 8, yes. Either way, the stack would be a concern if I spawned 480,000 at once. Looking to see if only 8 threads at a time is still a concern.
Each thread should have its own stack, but they're dynamically allocated with the thread. It might have been better to say you're overrunning a stack rather than the stack.
Kevin Barry
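
Because each thread gets its own stack, its size can be requested when the thread is created. The sketch below uses pthread_attr_setstacksize for that. The names worker, run_with_stack, and BIG are hypothetical, and the 300,000-element local array only mirrors the failing example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define BIG 300000   /* element count from the failing example */

static void *worker(void *arg)
{
    int local[BIG];              /* lives on this thread's own stack */
    int *out = arg;

    local[0] = 1;
    local[BIG - 1] = 41;
    *out = local[0] + local[BIG - 1];
    return NULL;
}

/* Start one worker with an explicitly sized stack and wait for it.
 * Returns 0 on success or a pthread error code. */
int run_with_stack(size_t stack_bytes, int *result)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    assert(result != NULL);
    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    err = pthread_attr_setstacksize(&attr, stack_bytes);
    if (err == 0)
        err = pthread_create(&tid, &attr, worker, result);
    pthread_attr_destroy(&attr);
    if (err != 0)
        return err;

    return pthread_join(tid, NULL);
}
```

A call such as run_with_stack(4u * 1024 * 1024, &r) gives the worker 4 MiB, comfortably more than the roughly 1.2 MB the array needs. This only affects threads created with that attribute. The main thread's stack is still governed by the process stack limit.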
 
04-22-2009, 11:49 PM   #11
ta0kira
Senior Member
Registered: Sep 2004
Distribution: FreeBSD 9.1, Kubuntu 12.10
Posts: 3,078

Quote:
Originally Posted by dmail
Doesn't Fedora use the LP64 model? I would think it does, which means int is 4 bytes.

What does ulimit -S report?
Who expands the stack?
Kevin Barry
 
04-23-2009, 03:39 AM   #12
dmail
Member
Registered: Oct 2005
Posts: 970

Quote:
Originally Posted by ta0kira
Who expands the stack?
Kevin Barry
That should have been a lowercase 's'; I was not suggesting that you increase the stack size, only that you query it.
 
04-23-2009, 02:02 PM   #13
ta0kira
Senior Member
Registered: Sep 2004
Distribution: FreeBSD 9.1, Kubuntu 12.10
Posts: 3,078

Quote:
Originally Posted by dmail
That should have been a lowercase 's'; I was not suggesting that you increase the stack size, only that you query it.
Sorry, that's not what I meant. I wasn't quite sure where stack expansion would take place were it to near capacity. I guess I should read up on it; it isn't something I ever really contemplated before.
Kevin Barry
 
  


Tags
pthreads, segmentation fault


All times are GMT -5.
