06-26-2009, 12:03 AM
|
#1
|
|
Senior Member
Registered: Jan 2006
Location: Charleston, SC, USA
Distribution: Debian Squeeze, Gentoo
Posts: 1,147
Rep:
|
gcc unreachable code
Hello all.
Does anyone know of a way to prevent gcc from optimizing out a piece of unreachable code? Even -O0 still removes it.
I need the code to be there because it is not, in fact, unreachable. Granted, I'm doing some pretty non-standard stuff (libunwind), but there's a label inside the block of "unreachable" code whose address is taken (a gcc extension to C) and stored in a local variable.
Thanks all!
|
|
|
|
06-26-2009, 02:38 AM
|
#2
|
|
Member
Registered: Dec 2007
Location: Alcalá de Henares (Madrid)
Distribution: Debian
Posts: 40
Rep:
|
temporary solution
Oummm, this is not an elegant solution at all, but you can make it reachable by adding an initialised integer (int dummy = -1) and adding something like this to your condition:
if((your code)||(dummy==0)){
}
|
|
|
|
06-26-2009, 04:15 AM
|
#3
|
|
Member
Registered: Sep 2007
Location: Mariposa
Distribution: Debian lenny, Slackware 12
Posts: 806
Rep: 
|
Um, could you please post a tiny example of a compilable, complete program which includes unreachable code which does not remain in the compiled program? That way we can play with it.
|
|
|
|
06-26-2009, 08:06 AM
|
#4
|
|
Senior Member
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,375
Rep: 
|
Quote:
Originally Posted by sieira
Oummm, this is not an elegant solution at all, but you can make it reachable by adding an initialised integer (int dummy = -1) and adding something like this to your condition:
if((your code)||(dummy==0)){
}
|
I would guess that compilers these days are smart enough to realise that the code you gave is unreachable because the variable dummy does not change value.
|
|
|
|
06-26-2009, 10:21 AM
|
#5
|
|
Senior Member
Registered: Nov 2005
Distribution: Debian
Posts: 2,015
|
Quote:
|
Originally Posted by graemef
I would guess that compilers these days are smart enough to realise that the code you gave is unreachable because the variable dummy does not change value.
|
Maybe you could declare the variable dummy volatile?
If the piece of code is a function, perhaps gcc's "used" attribute could help.
From Function-Attributes:
Quote:
used
This attribute, attached to a function, means that code must be emitted for the function even if it appears that the function is not referenced. This is useful, for example, when the function is referenced only in inline assembly.
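As a rough illustration of the attribute quoted above, here is a minimal sketch. The function name unwind_stub is made up for this example and does not come from the thread. A static function that nothing calls would normally be dropped, and "used" asks gcc to emit it anyway. Note that the attribute applies to a whole function, not to a block inside one.

```c
/* Hypothetical helper: static and never called from C code, so gcc
   could normally discard it. The "used" attribute asks gcc to emit
   its code regardless. */
static int __attribute__((used)) unwind_stub(void)
{
    return 42;
}
```

Whether the function really survives can be checked by compiling with -c and looking for the symbol with nm.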
|
|
|
|
|
06-26-2009, 12:26 PM
|
#6
|
|
Guru
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,861
Rep: 
|
PatrickNew -
Quote:
|
could you please post a tiny example of a compilable, complete program which includes unreachable code which does not remain in the compiled program? That way we can play with it.
|
I agree - I'd like to see an example, too.
Thanx in advance .. PSM
|
|
|
|
06-27-2009, 06:12 AM
|
#7
|
|
Senior Member
Registered: May 2005
Posts: 4,386
|
IIRC, if you first declare a local label (with __label__, a gcc feature that is not part of standard C), mark your code with that label, AND assign the label's address to a global variable, then the code might stay.
The label address operator is '&&', as opposed to the regular '&'.
Labels are described in the gcc manual, among the other extensions.
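For readers who haven't met these extensions, here is a minimal sketch of a local label declaration and the '&&' operator, used the way the gcc manual intends, with the jump staying inside one function. The function name dispatch and its return values are invented for the example.

```c
/* Pick a jump target at run time using gcc's labels-as-values extension. */
int dispatch(int which)
{
    __label__ first, second;                 /* local label declarations */
    void *targets[] = { &&first, &&second }; /* '&&' takes a label's address */

    goto *targets[which != 0];               /* computed goto */

first:
    return 10;
second:
    return 20;
}
```

Here both labels are visibly reachable through the computed goto, so gcc has no reason to discard the code behind them.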
|
|
|
|
06-27-2009, 05:45 PM
|
#8
|
|
Senior Member
Registered: Jan 2006
Location: Charleston, SC, USA
Distribution: Debian Squeeze, Gentoo
Posts: 1,147
Original Poster
Rep:
|
Code:
#include <stdio.h>
void* global;
void dont_know();
int main()
{
    int j = 3;
    global = &&main_END;
    dont_know();
    if(0){
    main_END:
        printf("Unreachable\n");
    }
}
This example code claims that everything inside the if(0) is unreachable. In truth, it's the dont_know() function that performs the jump back into main_END, but the compiler won't know that because dont_know() isn't even implemented (I compiled with -c and omitted it just to prevent the compiler from being able to analyze it). The block is still treated as unreachable if dont_know() is implemented.
So here at least, taking the address of the label isn't enough. My next thought was that it wouldn't mark it if there was a goto to it in the same function. But then I was down to the problem of ensuring that the goto was unreachable, but that the compiler didn't know that.
So far I've identified two things that work, but are hacks. As graemef mentioned, the compiler figures out the dummy variable trick, unless you additionally fool it somehow.
1. The variable can be volatile (thanks ntubski).
2. The variable can be global.
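A minimal sketch of the two workarounds listed above, combined into one file-scope volatile flag. The names never_taken, landing_addr and with_landing_pad are assumptions made for this example. Because the flag is volatile, gcc has to load it at run time and cannot prove that the goto is dead, so the labelled code stays in the function.

```c
#include <stdio.h>

/* Hypothetical guard: always 0 in practice, but volatile forces a real
   load, so the compiler cannot fold the test away. */
static volatile int never_taken = 0;

void *landing_addr;   /* plays the role of "global" in the example above */

int with_landing_pad(int x)
{
    landing_addr = &&landing;
    if (never_taken)
        goto landing;      /* never executed, but gcc cannot know that */
    return x + 1;

landing:
    puts("landing pad reached");
    return -1;
}
```

The cost is one extra load and branch per call, which is the overhead the rest of this post worries about.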
The application of all this is in machine-generated code that does exception handling. So I'm hesitant to add a new variable to every stack frame when it does nothing except trick the compiler, especially a volatile one that will interfere with the compiler's optimization. I'm also hesitant to add a global variable that is referenced that often, because of worries about cache locality in the generated code.
Ultimately, something like __attribute__((used)) is what I was hoping to find, but something that can be applied to a line of code instead of a function.
|
|
|
|
06-27-2009, 07:07 PM
|
#9
|
|
Member
Registered: Sep 2007
Location: Mariposa
Distribution: Debian lenny, Slackware 12
Posts: 806
Rep: 
|
Ok, here's a solution that's scarier than the question. Consider this shell script:
Code:
cat > 2.c <<EOD
/* "global" is defined in 1.c and holds the address of main_END. */
extern void *global;
void dont_know(void)
{
    /* Way out of line: treat the label address as a function and call it. */
    ((void (*)(void))global)();
}
EOD
cat > 1.c <<EOD; gcc 1.c 2.c -o 1; strings 1 | grep xxx
#include <stdio.h>
void *global;
void dont_know();
int main()
{
    int j = 3;
    global = &&main_END;
    dont_know();
    printf("xxxaaaaa\n");
    for(;;0){
    main_END:
        printf("xxxUnreachable\n");
    }
    printf("xxxbbbbb\n");
}
EOD
I'm sure that dont_know() is way out of line, but if I may address your original question:
This compiles the "unreachable" code, but it doesn't seem to compile the line that comes after it. The output is this:
Code:
xxxaaaaa
xxxUnreachable
|
|
|
|
06-27-2009, 07:17 PM
|
#10
|
|
Senior Member
Registered: Nov 2005
Distribution: Debian
Posts: 2,015
|
Quote:
|
In truth, it's the dont_know() function that performs the jump back into main_END, but the compiler won't know that because dont_know() isn't even implemented (I compiled with -c and omitted it just to prevent the compiler from being able to analyze it). It is still marked unreachable if the dont_know() stuff is implemented.
|
gcc probably assumes you follow this rule (from Labels as Values):
Quote:
|
You may not use this mechanism to jump to code in a different function.
|
Here's a trick: gcc can't analyse inline assembly
Code:
#include <stdio.h>
void* global;
void dont_know();
int main()
{
    int j = 3;
    global = &&main_END;
    dont_know();
    asm("jmp skip_unreachable\n");
    {
    main_END:
        printf("Unreachable\n");
    }
    asm("skip_unreachable:");
}
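A related sketch that avoids writing an architecture-specific jump instruction, assuming a compiler that supports gcc's "asm goto" extension (it needs a newer gcc than the ones available when this thread was written). The asm body is empty, but it names the label as a possible jump target, so the compiler has to treat the labelled code as reachable. The names pad_addr and protected_call are invented for the example.

```c
#include <stdio.h>

void *pad_addr;   /* plays the role of "global" in the code above */

int protected_call(int x)
{
    pad_addr = &&pad;

    /* Empty asm that lists pad as a label it might jump to. It never
       actually jumps, but gcc must assume that it can. */
    __asm__ goto ("" : : : : pad);

    return x * 2;

pad:
    puts("unwinding");
    return -1;
}
```

This only keeps the code alive. Whether it is safe to enter the function at pad from outside still depends on how the register and stack state is restored.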
Quote:
|
Originally Posted by wje_lq
This compiles the "unreachable" code, but it doesn't seem to compile the line that comes after it.
|
I believe you have an infinite loop there.
|
|
|
|
06-27-2009, 08:12 PM
|
#11
|
|
Senior Member
Registered: Jan 2006
Location: Charleston, SC, USA
Distribution: Debian Squeeze, Gentoo
Posts: 1,147
Original Poster
Rep:
|
I don't use a goto to jump back to the address. Rather, I'm using libunwind to put the entire state into a data structure (all registers, including instruction pointer). Then when exceptions are thrown, I modify the instruction pointer to point to the stack unwinding code before resuming from that state.
Quote:
Originally Posted by ntubski
Here's a trick: gcc can't analyse inline assembly
|
I like that idea a lot. I'd prefer a portable solution, but even if that required me to maintain per-platform code, the libunwind I depend on is only portable to 3 architectures, so that's not that big a deal. I like that - it might become my solution.
Thanks all!
|
|
|
|
06-27-2009, 08:29 PM
|
#12
|
|
Senior Member
Registered: Nov 2005
Distribution: Debian
Posts: 2,015
|
What does libunwind give you that setjmp/longjmp doesn't?
|
|
|
|
06-27-2009, 09:28 PM
|
#13
|
|
Member
Registered: Sep 2007
Location: Mariposa
Distribution: Debian lenny, Slackware 12
Posts: 806
Rep: 
|
Quote:
Originally Posted by PatrickNew
I'd prefer a portable solution
|
So does my trick do anything for you?
|
|
|
|
06-27-2009, 10:46 PM
|
#14
|
|
Senior Member
Registered: May 2005
Posts: 4,386
|
Quote:
Originally Posted by PatrickNew
Ultimately, something like __attribute__((used)) is what I was hoping to find, but something that can be applied to a line of code instead of a function.
|
Here is a piece of code which apparently works:
Code:
sergei@amdam2:~/junk> cat -n unreachable_code.c
1 #include <stdio.h>
2 void *start_addr;
3 void *end_addr;
4
5 int main()
6 {
7 __label__ start;
8 __label__ end;
9 int j = 3;
10 start_addr = &&start;
11 end_addr = &&end;
12
13 printf("start_addr=%lx end_addr=%lx\n", (unsigned long)start_addr, (unsigned long)end_addr);
14 if((int)end_addr == -1)
15 {
16 goto end;
17 }
18
19 if(0)
20 {
21 start:
22 printf("Unreachable 1, j=%d\n", j);
23 end:
24 printf("Unreachable 2, j=%d\n", j);
25 }
26
27 return 0;
28 }
sergei@amdam2:~/junk> gcc -Wall -Wextra unreachable_code.c -o unreachable_code
sergei@amdam2:~/junk> ./unreachable_code
start_addr=804841a end_addr=804841c
sergei@amdam2:~/junk>
It can probably be simplified: most likely one doesn't need to take the addresses of the labels.
The 'goto' in
Code:
14 if((int)end_addr == -1)
15 {
16 goto end;
17 }
will never be executed in reality.
|
|
|
|
06-27-2009, 11:54 PM
|
#15
|
|
Senior Member
Registered: Jan 2006
Location: Charleston, SC, USA
Distribution: Debian Squeeze, Gentoo
Posts: 1,147
Original Poster
Rep:
|
Quote:
Originally Posted by ntubski
What does libunwind give you that setjmp/longjmp doesn't?
|
The reason I'm interested is that I'm writing a compiler for a higher-level language that generates GNU C. The relevant parts of the language are very C++-like, so we can think of this as a C++ compiler for the purposes of this discussion.
As in C++, variables can be declared on the stack, and their associated destructors are run when that variable goes out of scope. This is easy to do when exceptions are not involved, either by manually inserting destructor calls or using gcc's cleanup attribute.
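A minimal sketch of the cleanup attribute mentioned above, for the no-exceptions case. The names release, cleanups and scoped are hypothetical. gcc calls the named function with a pointer to the variable when the variable goes out of scope, which is roughly what a generated destructor call would do.

```c
static int cleanups;   /* counts how many "destructors" have run */

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void release(int *resource)
{
    (void)resource;
    cleanups++;
}

int scoped(void)
{
    {
        int resource __attribute__((cleanup(release))) = 5;
        (void)resource;
    }   /* release(&resource) runs here, at the end of the block */

    return cleanups;
}
```

Each call to scoped() runs the handler exactly once, so the counter goes up by one per call.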
The trouble is with exceptions. My first idea for implementing exceptions was the traditional combination of setjmp() at the try and longjmp() at the throw, but it became very difficult to unwind the stack that way. Consider that we don't know how many functions in the call tree there are between the point of throw and the point of catch.
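To make the unwinding problem concrete, here is a minimal sketch of the setjmp()/longjmp() scheme with a single handler slot. All names (handler, throw_code, middle, try_call) are invented for the example. The counter in middle() stands in for a destructor: when the throw happens, longjmp() skips straight past it.

```c
#include <setjmp.h>

static jmp_buf handler;               /* hypothetical single "try" slot */
static volatile int thrown_code;      /* the "exception" being thrown */
static volatile int destructors_run;  /* stands in for real destructors */

static void throw_code(int code)
{
    thrown_code = code;
    longjmp(handler, 1);              /* jump straight back to the try */
}

/* A frame between the throw and the catch. */
static void middle(int code)
{
    if (code != 0)
        throw_code(code);
    destructors_run++;                /* skipped when longjmp() fires */
}

int try_call(int code)
{
    if (setjmp(handler) == 0) {       /* "try": costs a setjmp() every time */
        middle(code);
        return 0;                     /* normal exit */
    }
    return thrown_code;               /* "catch" */
}
```

After try_call(0) the counter is 1. After try_call(7) it is still 1, because the cleanup in middle() never ran.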
Libunwind solves two big problems for me. The first is the ability to "stop at" every function between throw and catch (and jump to a custom point in that function for unwinding code). The second is the performance characteristics of libunwind vs setjmp()/longjmp(). setjmp() has a non-negligible cost, and most times it is called it is unnecessary since most try blocks exit normally. I prefer a solution that might make throwing an exception expensive, but incurs no cost until an exception is thrown.
@wje_lq I'm going to test that and get back to you.
@Sergei Steshenko I'll test that too. I suspect that the unreachable code might still be removed. The difference between the two pointers is only 2. Even if that 2 is interpreted as 2 words, that's probably not enough code to call a varargs function.
|
|
|
|