Weird SIGALRM segmentation fault on 64bit linux...
Location: Geneva - Switzerland ( Bordeaux - France / Montreal - QC - Canada)
Distribution: Slackware 14.2 - 32/64bit
Posts: 609
Hi,
I've got a weird segmentation fault on the ending 'retq' instruction of my alarm callback, as if "calling pointer size" mismatched the 'q' of the 64bit retq. I've been trying to understand this bug for a while and couldn't get a clue.
This code worked well on 32bit Slackware/Ubuntu/Debian.
It now crashes on my 64bit Slackware install.
I've written a small test case script for those who want to try it :
Code:
#!/bin/sh
#test-sigalrm-pack.sh
# 64bit sigalrm segmentation fault test case package...
echo " * Generating source..."
cat >tst-sigalrm.cpp <<TESTSRC
//tst-sigalrm.cpp
#include <stdio.h>
#include <unistd.h>
#include <wait.h>
#include <sys/time.h>
typedef void* pvoid;
namespace{
  volatile unsigned int alarmed =0;
  struct sigaction action,oldAction;
  void _onAlarmSignal(int signal,siginfo_t* sigInfo,pvoid pUContext) {
    printf("Tick !\n");
    ++alarmed;
  }
  void _registerSignal() {
    action.sa_flags =SA_SIGINFO;
    action.sa_sigaction =_onAlarmSignal;
    action.sa_restorer =NULL;
    sigemptyset(&action.sa_mask);
    sigaction(SIGALRM,&action,&oldAction);
  }
  void _startTimer() {
    itimerval value;
    value.it_interval.tv_sec =0;
    value.it_interval.tv_usec =100;
    value.it_value =value.it_interval;
    setitimer(ITIMER_REAL,&value,NULL);
  }
}
int main(int argc,const char **argv) {
  _registerSignal();
  _startTimer();
  do ; while(alarmed<10);
  return 0;
}
TESTSRC
echo " * Generating build script..."
cat >tst-sigalrm-build <<TESTBUILD
#!/bin/sh
#Custom build of the sigalrm test case:
echo " * Build source..."
cc -c -o "tst-sigalrm.o" -fpermissive -g3 -ggdb -w -D _DEBUG "tst-sigalrm.cpp"
#Custom link the test case:
#
# In order to link I need first to make this link on my system,
# this is because most distros just forget about static link.
# If anybody has a better idea for this :)... (Something that could
# work on any distro without 'hacking' the install...)
#
# /usr/lib64/gcclib -> gcc/x86_64-slackware-linux/4.4.3
#
#
echo " * Linking..."
ld -static -L "/usr/lib64/" -o "tst-sigalrm" \\
/usr/lib64/crt1.o /usr/lib64/crti.o \\
/usr/lib64/gcclib/crtbegin.o \\
"tst-sigalrm.o" \\
-L/usr/lib64/gcclib \\
-\\( -lgcc -lstdc++ -lgcc_eh -lm -lc -\\) \\
/usr/lib64/gcclib/crtend.o \\
/usr/lib64/crtn.o
TESTBUILD
chmod a+x "tst-sigalrm-build"
Paste this script into a file (like "test-sigalrm-pack.sh") and execute it ( $ sh test-sigalrm-pack.sh ). In the current directory it will generate a cpp file (the source) and another script that performs the kind of link I need (static link).
Also, in order to link, you might want to create a symbolic link to your GCC library directory (see the note in the build script). I don't know how to do that "universally" (without the symbolic link 'hack'); ideas would be greatly appreciated ! :)
~/tmp/test-sig-alarm$ uname -sm
Linux x86_64
~/tmp/test-sig-alarm$ gcc --version
gcc (GCC) 4.2.4 (Ubuntu 4.2.4-1ubuntu4)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
~/tmp/test-sig-alarm$ cat /etc/issue
Ubuntu 8.04.3 LTS \n \l
~/tmp/test-sig-alarm$ ./tst-sigalrm
Tick !
Tick !
Tick !
Tick !
Tick !
Tick !
Tick !
Tick !
Tick !
Tick !
~/tmp/test-sig-alarm$
Quote:
Originally Posted by ntubski
I get no crash here:
...
You can link with just
Code:
g++ -static tst-sigalrm.o -o tst-sigarm
Hi, thank you for your quick reply.
I know about linking, but "the real situation" uses a build system that generates makefiles from project definitions, so I extracted the 'link line' from the generated makefile. If the problem comes from the link step, I need to know why so I can fix the build system. My compilation phases are separate. I'm not using this code for *that* useless program of course :). What I mean is that I need a separate "ld" pass.
Cheers
Garry.
Last edited by NoStressHQ; 04-17-2010 at 05:27 PM.
Reason: (Precision about 'ld')
I was going to say that the g++ command I posted just calls ld with the correct arguments, but actually it calls collect2 which then calls ld. Anyway, the test program doesn't crash whichever way I link it.
Thanks
Hey,
Thank you all for taking the time to test. Sorry, I was busy on another project and couldn't check sooner.
First of all, of course, anybody should know that stderr stuff and warnings are irrelevant to the problem.
To those who tried to understand and test, thank you. It's true that I can reproduce the problem with a much simpler compile line. In fact I first suspected that the build system I use was linking with the wrong crt files/gcc libs, but trying with the simple "g++" command, I found that it works well with shared linking (no special option) and still crashes with static linking (-static), so I updated the test case...
(If you don't want the whole script and still got the source somewhere you can just try these:
Forum newbie != programming newbie :)
My points were:
1- I explained it crashes on the 'retq' of the callback even with an empty callback. I traced it instruction by instruction, watching the stack and everything...
2- This is a simple test case and stderr or stdout are both buffered and just don't use the same channel. And anyway it's just to show something... Again, it crashes even with an empty function...
3- See point 1.
Sorry if I 'sounded' rude, it's just that the question, as I understand it, is "far away" from your answer, which seems to be aimed at a programming student. No offense to students of course, and no offense to you; I had the "I'm not a noob" reflex.
Quote:
Originally Posted by Sergei Steshenko
What about '-Wall -Wextra' ?
It's that "in real life" I use "warnings as errors" and "warning level 4 (max), whatever the compiler"...
I asked a precise question, "implicitly" explaining that I had traced with a debugger... (Mentioning 'retq' implies you understand 'a bit' how a CPU works, how an OS works, how a compiler works, how a debugger works)...
I wrote just a "test case" to show a sample of the crash... Then explain to me how "warning all" could change anything about an empty function... Again, explain to me why it works with shared linkage and not with static linkage...
My point is simply:
"Don't try to correct what you think is wrong with the person asking a question, just answer the question".
You never know who is asking, or what that person did and knows, so don't take him for a noob... And I think if you better understood how a compiler works, assembly language, and how to use a debugger, you wouldn't even have talked about "warnings".
Sorry, again, I've spent several days (and maybe weeks) tracking this down, so the pedantic "have you tried warning all" irritates me...
And also, thank you again to those who took the time to test and to answer the questions. I didn't mean to start any debate here.
According to the standard, you're really not allowed to do much in a signal handler. All you are guaranteed to be able to do in the signal-handling function, without causing undefined behavior, is to call signal, and to assign a value to a static object of type volatile sig_atomic_t.
You might be missing a number of points. For example, I know that with each new release 'gcc' is getting more and more stringent WRT language compliance. So that nobody needs to guess, it's better to have the compiler always produce all the warnings it can, i.e. somebody else trying your example with the newest compiler might see a warning you do not have.
About answering questions: often answering a question with a question is a good answer.
Quote:
Originally Posted by Sergei Steshenko
*printf is too much to my taste.
Sorry but I still don't see the 'relevance'...
An empty function is 'too much' ?
If it crashes on your machine -> remove the printf and you'll see it still crashes...
If it doesn't crash on your machine, all you can do is tell me "which system" you have, so I can pinpoint the guilty part of this bug.
I had this code working inside a framework of 30+ projects for more than a year on Slackware 32...
This code compiles and works on Slackware 64 with SHARED linkage...
This code compiles and crashes on Slackware 64 with STATIC linkage (at the very specific moment of 'popping' the return address of the kernel caller (?) )... So I suspect a 32/64 bit mismatch... Nothing related to 'race condition', 'timeout' or 'warning'...
Of course when I say "this code" I mean this way of using SIGALRM... I don't printf in my 'real life' callback...
Again I'm not asking for programming courses... I see the bug, I just want to find out why it happens and how to fix it.
Btw, thank you for taking time to answer.
Cheers
Last edited by NoStressHQ; 04-23-2010 at 06:28 PM.
Reason: (Changed a wrong slackware 32 for slackware 64 :) )
So you are advertising yourself as not a newbie.
I went through previous posts in this thread and I do not see the following info:
Your OS version (just name);
Your 'gcc' version;
Your 'glibc' version;
Your 'binutils' version.
Meanwhile, just from a web search and some browsing, I see some 'retq' related bugs. So maybe your combination of OS + 'gcc' + 'glibc' + 'binutils' versions is affected by such a bug.