Weird SIGALRM segmentation fault on 64bit linux...
Hi,
I've got a weird segmentation fault on the final 'retq' instruction of my alarm callback, as if the "calling pointer size" did not match the 'q' of the 64-bit retq. I've been trying to understand this bug for a while and haven't found a clue. This code worked well on 32-bit Slackware/Ubuntu/Debian. It now crashes on my 64-bit Slackware install. I've written a small test case script for those who want to try it:
Code:
#!/bin/sh
Also, in order to link, you might want to create a symbolic link to your glibc files (see the note in the build script). I don't know how to do that "universally" (without the symbolic link 'hack'); ideas would be greatly appreciated! :)
Thank you.
Garry. |
I get no crash here:
Code:
~/tmp/test-sig-alarm$ uname -sm
Code:
g++ -static tst-sigalrm.o -o tst-sigarm |
Quote:
I know about linking, but "the real situation" uses a build system that generates makefiles from project definitions, so I extracted the 'link line' from the generated makefile. If the problem comes from the link step, I need to know why so I can fix the build system. I have separated the compilation phases. I'm not using this code for *that* useless program, of course :). What I mean is that I need a separate "ld" pass.
Cheers
Garry. |
Quote:
Code:
printf("Tick !\n");
Code:
fprintf(stderr, "Tick !\n");
Also, add '-Wall -Wextra' to the compilation line. |
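As a side note to the two snippets above: neither printf nor fprintf is on the POSIX list of async-signal-safe functions, while write(2) is. The following is a minimal sketch of a "Tick" handler that avoids stdio entirely. The names tick_handler, install_tick_handler and g_out_fd are hypothetical and do not come from the original test case, and nothing here is claimed to explain the crash reported in this thread.

```cpp
#include <signal.h>
#include <string.h>
#include <unistd.h>

// Hypothetical global: descriptor the handler writes to (stderr by default).
static volatile sig_atomic_t g_out_fd = STDERR_FILENO;

// Handler that only calls write(2), which POSIX lists as async-signal-safe.
extern "C" void tick_handler(int) {
    static const char msg[] = "Tick !\n";
    ssize_t ignored = write(g_out_fd, msg, sizeof msg - 1);
    (void)ignored;  // nothing useful can be done about a failed write here
}

// Installs tick_handler for SIGALRM and redirects its output to fd.
// Returns 0 on success, -1 on failure (same convention as sigaction).
int install_tick_handler(int fd) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = tick_handler;
    sigemptyset(&sa.sa_mask);
    g_out_fd = fd;
    return sigaction(SIGALRM, &sa, NULL);
}
```

Calling install_tick_handler(STDERR_FILENO) followed by alarm(1) and pause() should print one tick after about a second.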
Thanks
Hey,
Thank you all for taking the time to test. Sorry, I was busy on another project and couldn't check sooner. First of all, of course, anybody should know that the stderr stuff and the warnings are irrelevant to the problem. For those who tried to understand and test: thank you, and it's true that I can reproduce the problem with a much simpler compile line. I first suspected that the build system I used was linking with the wrong crts/gcc libs, but trying with the plain "g++" command, I found that it works with shared linking (no special option) and still crashes with static linking (-static), so I updated the test case... If you don't want the whole script and still have the source somewhere, you can just try these:
Code:
g++ tst-sigalrm.cpp -o tst-sigalrm-shared
Code:
#!/bin/sh
So does anyone have a clue? Might it be a problem with Slackware 64 only? Some static library built for the wrong "arch", or something like that?
Thanks
Garry. |
Quote:
Maybe. My points are:
I.e. using stderr rather than stdout for diagnostic output is SOP, and I see no reason to change it. ... What about '-Wall -Wextra' ? |
Forum newbie != programming newbie :)
My points were:
1- I explained that it crashes on the 'retq' of the callback even with an empty callback. I traced it instruction by instruction, watching the stack and everything...
2- This is a simple test case, and stderr or stdout are both buffered and just don't use the same channel. Anyway, the output is only there to show something... Again, it crashes even with an empty function...
3- See point 1.
Sorry if I 'sounded' rude; it's just that the question, as I understand it, is "far away" from your answer, which seems to be intended for a programming student. No offense to students of course :), and no offense to you; I had the "I'm not a noob" reflex. Learning is the path to follow...
Cheers.
Garry. |
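For readers who want to try the "empty callback" case described in point 1 without the full script, here is a minimal sketch. It is not the original tst-sigalrm.cpp, which is not reproduced in this thread; the names on_alarm, g_ticks and run_alarm_once are hypothetical. It uses raise() instead of alarm() so that it returns immediately, on the assumption that installing the handler and returning from it is the part of interest.

```cpp
#include <signal.h>
#include <string.h>

// Hypothetical counter; sig_atomic_t can be written safely from a handler.
static volatile sig_atomic_t g_ticks = 0;

// Near-empty handler: it only bumps the counter, then returns to the caller.
extern "C" void on_alarm(int) {
    g_ticks = g_ticks + 1;
}

// Installs the handler, delivers SIGALRM once, and returns the tick count.
// Returns -1 if the handler could not be installed.
int run_alarm_once() {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0) {
        return -1;
    }
    g_ticks = 0;
    raise(SIGALRM);  // in a single-threaded program the handler has run on return
    return g_ticks;
}
```

Calling run_alarm_once() from main and building the program twice, once with plain g++ and once with -static as in the thread, gives a shared/static comparison. To mimic a real timer, raise(SIGALRM) could be swapped for alarm(1) followed by pause().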
Quote:
I asked a precise question, "implicitly" explaining that I traced with a debugger... (Mentioning 'retq' implies you understand 'a bit' how a CPU works, an OS works, a compiler works, a debugger works)... I wrote just a "test case" to show a sample of the crash... Then explain to me how "warning all" could change anything about an empty function... Again, explain to me why it works with shared linkage and not with static linkage... My point is simply: "Don't try to correct what you think is wrong for the one who asks a question, just answer his question". You never know who is asking, or what that person has done and knows, so don't take him for a noob... And I think that if you understood better how a compiler works, assembly language, and how to use a debugger, you wouldn't even have talked about "warnings". Sorry, again, I've spent several days (and maybe weeks) tracking this down, so the pedantic "have you tried warning all" is irritating me... :) And also, thank you again to those who took the time to test and to answer the questions. I didn't mean to start any debate here :).
"Peace"
Garry. |
I am not sure how valid the following words are:
http://stackoverflow.com/questions/1...in-the-handler : Quote:
|
Quote:
About answering questions: often, answering a question with a question is a good answer. |
Quote:
An empty function is 'too much'? If it crashes on your machine, remove the printf and you'll see it still crashes... If it doesn't crash on your machine, you can only tell me "which system" you have, so I can pinpoint the guilty part of this bug. I had this code working inside a framework of 30+ projects for more than a year on Slackware 32... This code compiles and works on Slackware 64 with SHARED linkage... This code compiles and crashes on Slackware 64 with STATIC linkage (at the very specific moment of 'popping' the return address of the kernel caller (?))... So I suspect a 32/64-bit mismatch... Nothing related to 'race condition', 'timeout' or 'warning'... Of course, when I say this code, I mean this way of using SIGALRM... I don't printf in my 'real life' callback... :) Again, I'm not asking for programming courses... I see the bug; I just want to find out why it happens and how to fix it. Btw, thank you for taking the time to answer.
Cheers
Quote:
I went through previous posts in this thread and I do not see the following info:
Meanwhile, just by searching and browsing the web, I see some 'retq'-related bugs. So maybe your combination of OS + 'gcc' + 'glibc' + 'binutils' versions is affected by such a bug. An example of such a bug: http://sourceware.org/ml/binutils/2008-03/msg00111.html . I.e. in order to resolve the issue, I would try different (newer, if available) versions of the above tools. ... Why another thread: http://www.linuxquestions.org/questi...broken-803845/ ? |