LinuxQuestions.org
Old 04-17-2010, 05:20 PM   #1
NoStressHQ
Member
 
Registered: Apr 2010
Location: Lausanne - Switzerland ( Bordeaux - France / Montreal - QC - Canada)
Distribution: Slackware Leet - 32/64bit
Posts: 317

Weird SIGALRM segmentation fault on 64bit linux...


Hi,

I've got a weird segmentation fault on the ending 'retq' instruction of my alarm callback, as if "calling pointer size" mismatched the 'q' of the 64bit retq. I've been trying to understand this bug for a while and couldn't get a clue.
This code worked well on 32bit Slackware/Ubuntu/Debian.
It now crashes on my 64bit Slackware install.

I've written a small test case script for those who want to try it :

Code:
#!/bin/sh
#test-sigalrm-pack.sh
# 64bit sigalrm segmentation fault test case package...

echo " * Generating source..."
cat	>tst-sigalrm.cpp	<<TESTSRC
//tst-sigalrm.cpp
#include <stdio.h>
#include <unistd.h>
#include <wait.h>
#include <sys/time.h>

typedef	void*	pvoid;

namespace{
	volatile	unsigned	int	alarmed	=0;
	struct	sigaction	action,oldAction;

	void	_onAlarmSignal(int	signal,siginfo_t* sigInfo,pvoid pUContext) {
		printf("Tick !\n");
		++alarmed;
	}

	void	_registerSignal() {
		action.sa_flags		=SA_SIGINFO;
		action.sa_sigaction	=_onAlarmSignal;
		action.sa_restorer	=NULL;
		sigemptyset(&action.sa_mask);

		sigaction(SIGALRM,&action,&oldAction);
	}

	void	_startTimer() {
		itimerval	value;
		value.it_interval.tv_sec	=0;
		value.it_interval.tv_usec	=100;
		value.it_value	=value.it_interval;
		setitimer(ITIMER_REAL,&value,NULL);
	}
}

int	main(int argc,const char **argv) {

	_registerSignal();
	_startTimer();

	do	;	while(alarmed<10);

	return	0;
}
TESTSRC

echo " * Generating build script..."
cat	>tst-sigalrm-build	<<TESTBUILD
#!/bin/sh

#Custom build of the sigalrm test case:
echo " * Build source..."
cc -c -o "tst-sigalrm.o" -fpermissive -g3 -ggdb -w -D _DEBUG "tst-sigalrm.cpp"

#Custom link the test case:
#
#	In order to link I need first to make this link on my system,
#	this is because most distros just forget about static link.
#	If anybody has a better idea for this :)... (Something that could
#	work on any distro without 'hacking' the install...)
#
#	/usr/lib64/gcclib -> gcc/x86_64-slackware-linux/4.4.3
#
#
echo " * Linking..."
ld -static -L "/usr/lib64/" -o "tst-sigalrm" \\
	/usr/lib64/crt1.o /usr/lib64/crti.o \\
	/usr/lib64/gcclib/crtbegin.o  \\
	"tst-sigalrm.o" \\
	-L/usr/lib64/gcclib \\
	-\\( -lgcc -lstdc++ -lgcc_eh -lm -lc -\\) \\
	/usr/lib64/gcclib/crtend.o \\
	/usr/lib64/crtn.o

TESTBUILD
chmod a+x "tst-sigalrm-build"
Paste this script into a file (e.g. "test-sigalrm-pack.sh") and execute it ( $ sh test-sigalrm-pack.sh ). It will generate, in the current directory, a cpp file (the source) and another script file that performs the kind of link I need (static link).
Also, in order to link you might need to create a symbolic link to your GCC library directory (see the note in the build script). I don't know how to do that "universally" (without the symbolic link 'hack'), so ideas would be greatly appreciated! :)

Thank you.

Garry.
 
Old 04-17-2010, 06:17 PM   #2
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,541

I get no crash here:
Code:
~/tmp/test-sig-alarm$ uname -sm
Linux x86_64
~/tmp/test-sig-alarm$ gcc --version
gcc (GCC) 4.2.4 (Ubuntu 4.2.4-1ubuntu4)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

~/tmp/test-sig-alarm$ cat /etc/issue
Ubuntu 8.04.3 LTS \n \l

~/tmp/test-sig-alarm$ ./tst-sigalrm
Tick !
Tick !
Tick !
Tick !
Tick !
Tick !
Tick !
Tick !
Tick !
Tick !
~/tmp/test-sig-alarm$
You can link with just
Code:
g++ -static tst-sigalrm.o -o tst-sigalrm
 
Old 04-17-2010, 06:25 PM   #3
NoStressHQ
Member
 
Registered: Apr 2010
Location: Lausanne - Switzerland ( Bordeaux - France / Montreal - QC - Canada)
Distribution: Slackware Leet - 32/64bit
Posts: 317

Original Poster
Quote:
Originally Posted by ntubski
I get no crash here:
...
You can link with just
Code:
g++ -static tst-sigalrm.o -o tst-sigalrm
Hi thank you for your quick reply.
I know about linking, but "the real situation" uses a build system that generates makefiles from project definitions, so I extracted the 'link line' from the generated makefile. If the problem comes from the link step, I need to know why so I can fix the build system. I keep the compilation phases separate. I'm not using this code for *that* useless program of course :). What I mean is that I need a separate "ld" pass.

Cheers

Garry.

Last edited by NoStressHQ; 04-17-2010 at 06:27 PM. Reason: (Precision about 'ld')
 
Old 04-17-2010, 06:44 PM   #4
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Quote:
Originally Posted by NoStressHQ
...
I would first suggest replacing

Code:
printf("Tick !\n");
with

Code:
fprintf(stderr, "Tick !\n");
in order to avoid any stdout buffering.


Also, add '-Wall -Wextra' to the compilation line.
 
Old 04-17-2010, 07:51 PM   #5
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,541

Quote:
Originally Posted by NoStressHQ
What I mean is that I need a separate "ld" pass.
I was going to say that the g++ command I posted just calls ld with the correct arguments, but actually it calls collect2 which then calls ld. Anyway, the test program doesn't crash whichever way I link it.
 
Old 04-23-2010, 05:48 PM   #6
NoStressHQ
Member
 
Registered: Apr 2010
Location: Lausanne - Switzerland ( Bordeaux - France / Montreal - QC - Canada)
Distribution: Slackware Leet - 32/64bit
Posts: 317

Original Poster
Thanks

Hey,

Thank you all for taking time to test. Sorry I was busy on another project and couldn't check sooner.

First of all, of course, anybody should know that stderr stuff and warnings are irrelevant to the problem.

For those who tried to understand and test, thank you. It's true that I can reproduce the problem with a much simpler compile line. At first I suspected that the build system I use was linking with the wrong crts/gcc libs, but with the simple "g++" command I found that the program works well with shared linking (no special option) and still crashes when linked statically (-static), so I updated the test case...

If you don't want the whole script and still have the source somewhere, you can just try these:
Code:
g++ tst-sigalrm.cpp -o tst-sigalrm-shared
g++ -static tst-sigalrm.cpp -o tst-sigalrm-static
test-sigalrm-pack2.sh:
Code:
#!/bin/sh
#test-sigalrm-pack2.sh
# 64bit sigalrm segmentation fault test case package...

echo " * Generating source..."
cat	>tst-sigalrm.cpp	<<TESTSRC
//tst-sigalrm.cpp
#include <stdio.h>
#include <unistd.h>
#include <wait.h>
#include <sys/time.h>

typedef	void*	pvoid;

namespace{
	volatile	unsigned	int	alarmed	=0;
	struct	sigaction	action,oldAction;

	void	_onAlarmSignal(int	signal,siginfo_t* sigInfo,pvoid pUContext) {
		printf("Tick !\n");
		++alarmed;
	}

	void	_registerSignal() {
		action.sa_flags		=SA_SIGINFO;
		action.sa_sigaction	=_onAlarmSignal;
		action.sa_restorer	=NULL;
		sigemptyset(&action.sa_mask);

		sigaction(SIGALRM,&action,&oldAction);
	}

	void	_startTimer() {
		itimerval	value;
		value.it_interval.tv_sec	=0;
		value.it_interval.tv_usec	=100;
		value.it_value	=value.it_interval;
		setitimer(ITIMER_REAL,&value,NULL);
	}
}

int	main(int argc,const char **argv) {

	_registerSignal();
	_startTimer();

	do	;	while(alarmed<10);

	return	0;
}
TESTSRC

echo " * Generating build script..."
cat	>tst-sigalrm-build	<<TESTBUILD
#!/bin/sh

#Builds of the sigalrm test case:
g++ tst-sigalrm.cpp -o tst-sigalrm-shared
g++ -static tst-sigalrm.cpp -o tst-sigalrm-static

echo " * Shared run :"
./tst-sigalrm-shared
echo " * Static run :"
./tst-sigalrm-static
TESTBUILD
chmod a+x "tst-sigalrm-build"
It just compiles with the simple g++ command, using -static for one of the two builds.

So does anyone have a clue ? Might it be a problem with Slackware 64 only ? Some static library built with the wrong "arch" or something like this ?

Thanks

Garry.
 
Old 04-23-2010, 06:31 PM   #7
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Quote:
Originally Posted by NoStressHQ
Hey,

...
First of all, of course, anybody should know that stderr stuff and warnings are irrelevant to the problem.
...

Maybe.

My points are:
  1. it is not known exactly where the program crashes;
  2. output to stderr is unbuffered, so if it happens before the crash, there is a chance to see it;
  3. knowing the exact location of the last statement executed before the crash might help in debugging the problem.

I.e. using stderr rather than stdout for diagnostic output is SOP, and I see no reason to change it.

...

What about '-Wall -Wextra' ?
 
Old 04-23-2010, 06:48 PM   #8
NoStressHQ
Member
 
Registered: Apr 2010
Location: Lausanne - Switzerland ( Bordeaux - France / Montreal - QC - Canada)
Distribution: Slackware Leet - 32/64bit
Posts: 317

Original Poster
Forum newbie != programming newbie :)

My points were:
1- I explained it crashes on the 'retq' of the callback even with an empty callback. I traced it instruction by instruction, watching the stack and everything...
2- This is a simple test case and stderr or stdout are both buffered and just don't use the same channel. And anyway it's just to show something... Again, it crashes even with an empty function...
3- See point 1.

Sorry if I 'sounded' rude, it's just that the question, as I understand it, is "far away" from your answer, which seems to be intended for a programming student. No offense to students of course, and no offense to you; I had the "I'm not a noob" reflex.

Learning is the path to follow...

Cheers.

Garry.
 
Old 04-23-2010, 06:58 PM   #10
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Quote:
Originally Posted by NoStressHQ
...
What about '-Wall -Wextra' ?
 
Old 04-23-2010, 07:11 PM   #11
NoStressHQ
Member
 
Registered: Apr 2010
Location: Lausanne - Switzerland ( Bordeaux - France / Montreal - QC - Canada)
Distribution: Slackware Leet - 32/64bit
Posts: 317

Original Poster
Quote:
Originally Posted by Sergei Steshenko
What about '-Wall -Wextra' ?
It's about the fact that "in real life" I use "warnings as errors" and "max warning level 4, whatever the compiler"...
I asked a precise question, "implicitly" explaining that I traced with a debugger... (Usage of 'retq' implies you understand 'a bit' how a CPU works, an OS works, a compiler works, a debugger works)...

I wrote just a "test case" to show a sample of the crash... So explain to me how a "warning all" could change anything about an empty function... And again, explain to me why it works with shared linkage and not with static linkage...

My point is simply:
"Don't try to correct what you think is wrong with the person who asks a question, just answer the question".

You never know who is asking, or what that person has done and knows, so don't take him for a noob... And I think if you understood better how a compiler works, assembly language, and how to use a debugger, you wouldn't even have talked about "warnings".

Sorry, again, I've spent several days (and maybe weeks) tracking this, so the pedantic "have you tried warning all" is irritating me...

And also, thank you again to those who took the time to test and to answer the questions. I didn't mean to start any debate here.

"Peace"

Garry.
 
Old 04-23-2010, 07:14 PM   #12
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

I am not sure how valid the following words are:

http://stackoverflow.com/questions/1...in-the-handler :

Quote:
According to the standard, you're really not allowed to do much in a signal handler. All you are guaranteed to be able to do in the signal-handling function, without causing undefined behavior, is to call signal, and to assign a value to a volatile static object of type sig_atomic_t.
*printf is too much for my taste.
 
Old 04-23-2010, 07:20 PM   #13
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Quote:
Originally Posted by NoStressHQ
...
You might be missing a number of points. For example, I know that with each new release 'gcc' is getting more and more stringent WRT language compliance. So, nobody needs to guess, it's better the compiler always produces all the warnings it can. I.e. somebody else trying your example with the newest compiler might see a warning you do not have.

About answering question - often answering a question with a question is a good answer.
 
Old 04-23-2010, 07:27 PM   #14
NoStressHQ
Member
 
Registered: Apr 2010
Location: Lausanne - Switzerland ( Bordeaux - France / Montreal - QC - Canada)
Distribution: Slackware Leet - 32/64bit
Posts: 317

Original Poster
Quote:
Originally Posted by Sergei Steshenko
*printf is too much to my taste.
Sorry but I still don't see the 'relevance'...

Is an empty function 'too much'?

If it crashes on your machine, remove the printf and you'll see it still crashes...
If it doesn't crash on your machine, all you can usefully report is which system you run, so I can pinpoint the guilty part of this bug.

I have had this code working inside a framework of 30+ projects for more than a year on slackware 32...

This code compiles and works on slackware 64 with SHARED linkage...

This code compiles and crashes on slackware 64 with STATIC linkage (at the very specific moment of 'popping' the return address of the kernel caller (?) )... So I suspect a 32/64 bit mismatch... Nothing related to a 'race condition', a 'timeout' or a 'warning'...

Of course, when I say "this code" I mean this way of using sigalrm... I don't printf in my 'real life' callback...

Again, I'm not asking for programming courses... I see the bug, I just want to find out why it happens and how to fix it.

Btw, thank you for taking time to answer.

Cheers

Last edited by NoStressHQ; 04-23-2010 at 07:28 PM. Reason: (Changed a wrong slackware 32 for slackware 64 :) )
 
Old 04-23-2010, 08:05 PM   #15
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Quote:
Originally Posted by NoStressHQ
...
So you are advertising yourself as not a newbie.

I went through previous posts in this thread and I do not see the following info:
  1. Your OS version (just name);
  2. Your 'gcc' version;
  3. Your 'glibc' version;
  4. Your 'binutils' version.

Meanwhile, just performing a web search I see some 'retq' related bugs. So, maybe your combination of OS + 'gcc' + 'glibc' + 'binutils' versions is affected by such a bug.

An example of such a bug: http://sourceware.org/ml/binutils/2008-03/msg00111.html .

I.e. in order to resolve the issue I would try to use different (newer if available) versions of the above tools.

...

Why another thread:
http://www.linuxquestions.org/questi...broken-803845/ ?
 
  


Tags
c++, glibc, link, linux, static

