Old 06-26-2009, 12:03 AM   #1
PatrickNew
Senior Member
 
Registered: Jan 2006
Location: Charleston, SC, USA
Distribution: Debian, Gentoo, Ubuntu, RHEL
Posts: 1,148
Blog Entries: 1

Rep: Reputation: 48
gcc unreachable code


Hello all.

Does anyone know of a way to prevent gcc from optimizing out a piece of unreachable code? Even -O0 still removes it.

I need the code to be there because it is not, in fact, unreachable. Granted, I'm doing some pretty non-standard stuff (libunwind), but there's a label inside the block of "unreachable" code whose address is taken (a gcc extension to C) and stored in a local variable.

Thanks all!
 
Old 06-26-2009, 02:38 AM   #2
sieira
Member
 
Registered: Dec 2007
Location: Alcalá de Henares (Madrid)
Distribution: Debian
Posts: 40

Rep: Reputation: 16
temporary solution

Oummm, this is not an elegant solution at all, but you can make the code reachable by adding an initialised integer (int dummy = -1) and adding something like this to your condition:

if((your code)||(dummy==0)){

}

Last edited by sieira; 06-26-2009 at 02:39 AM.
 
Old 06-26-2009, 04:15 AM   #3
wje_lq
Member
 
Registered: Sep 2007
Location: Mariposa
Distribution: Debian lenny, Slackware 12
Posts: 808

Rep: Reputation: 178
Um, could you please post a tiny example of a compilable, complete program which includes unreachable code which does not remain in the compiled program? That way we can play with it.
 
Old 06-26-2009, 08:06 AM   #4
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,376

Rep: Reputation: 147
Quote:
Originally Posted by sieira View Post
Oummm, this is not an elegant solution at all, but you can make the code reachable by adding an initialised integer (int dummy = -1) and adding something like this to your condition:

if((your code)||(dummy==0)){

}
I would guess that compilers these days are smart enough to realise that the code you gave is unreachable because the variable dummy does not change value.
 
Old 06-26-2009, 10:21 AM   #5
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,398

Rep: Reputation: 814
Quote:
Originally Posted by graemef
I would guess that compilers these days are smart enough to realise that the code you gave is unreachable because the variable dummy does not change value.
Maybe you could declare the variable dummy volatile?
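
A minimal sketch of the volatile variant, assuming gcc or clang (the &&label operator is a GNU extension). The names resume_point, guarded_volatile and handler are made up for illustration. Because dummy is volatile, the compiler has to load it at run time and cannot prove the branch is never taken, so the guarded block should stay in the output:

```c
#include <stdio.h>

/* Hypothetical global that receives the label address. */
void *resume_point;

/* Returns 0 on the normal path; the guarded block would return 1. */
int guarded_volatile(void)
{
    /* Never written again, but volatile forces a real load on every read,
       so the compiler cannot fold (dummy == 0) to false. */
    volatile int dummy = -1;

    resume_point = &&handler;   /* GNU C: address of a label */

    if (dummy == 0) {
    handler:
        puts("reached the guarded block");
        return 1;
    }
    return 0;
}
```

The cost is one extra stack slot and one load per call, which may or may not matter for generated code.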

If the piece of code is a function, perhaps gcc's "used" attribute could help.
From the Function Attributes page of the gcc manual:
Quote:
used
This attribute, attached to a function, means that code must be emitted for the function even if it appears that the function is not referenced. This is useful, for example, when the function is referenced only in inline assembly.
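
A small sketch of how the attribute is attached, assuming gcc or clang. The function names hidden_handler and visible_entry are hypothetical. Nothing calls hidden_handler(), so without the attribute the compiler would be free to drop it:

```c
/* Nothing in this file calls hidden_handler(). Without the attribute, gcc may
   discard an unreferenced static function; with it, code is still emitted. */
static int __attribute__((used)) hidden_handler(int x)
{
    return x + 1;
}

/* An ordinary function with external linkage, for comparison. */
int visible_entry(int x)
{
    return x * 2;
}
```

Compiling with gcc -O2 -c and running nm on the object file should list hidden_handler. Note that the attribute applies to whole functions, not to a block inside one.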
 
Old 06-26-2009, 12:26 PM   #6
paulsm4
Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
PatrickNew -
Quote:
could you please post a tiny example of a compilable, complete program which includes unreachable code which does not remain in the compiled program? That way we can play with it.
I agree - I'd like to see an example, too.

Thanx in advance .. PSM
 
Old 06-27-2009, 06:12 AM   #7
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453
IIRC, if you declare a label first (local label declarations are a gcc feature, not part of standard C), mark your code with that label, AND assign the label's address to a global variable, then the code might stay.

The label-address operator is '&&', as opposed to the regular '&'.

Labels as values are described in the gcc manual, among the other extensions.
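
For reference, a minimal sketch of those extensions used the supported way, that is, with the jump staying inside one function. It assumes gcc or clang; dispatch, first and second are made-up names:

```c
/* Jumps to one of two labels through a table of label addresses. */
int dispatch(int which)
{
    __label__ first, second;                  /* local label declarations (GNU C);
                                                 must come first in the block */
    void *targets[] = { &&first, &&second };  /* && takes a label's address */

    if (which < 0 || which > 1)
        return -1;                            /* no such target */

    goto *targets[which];                     /* computed goto */

first:
    return 10;
second:
    return 20;
}
```

Here dispatch(0) returns 10 and dispatch(1) returns 20. Because the compiler can see the computed goto, both labelled blocks count as reachable.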
 
Old 06-27-2009, 05:45 PM   #8
PatrickNew
Senior Member
 
Registered: Jan 2006
Location: Charleston, SC, USA
Distribution: Debian, Gentoo, Ubuntu, RHEL
Posts: 1,148
Blog Entries: 1

Original Poster
Rep: Reputation: 48
Code:
#include <stdio.h>
void* global;
void dont_know();
int main()
{
  int j = 3;
  global = &&main_END;
  dont_know();
  if(0){
  main_END:
    printf("Unreachable\n");
  }
}
This example code claims that everything inside the if(0) is unreachable. In truth, it's the dont_know() function that performs the jump back into main_END, but the compiler won't know that because dont_know() isn't even implemented (I compiled with -c and omitted it just to prevent the compiler from being able to analyze it). The block is still treated as unreachable if dont_know() is implemented.

So here at least, taking the address of the label isn't enough. My next thought was that gcc wouldn't mark the block unreachable if there was a goto to the label in the same function. But then I was back to the problem of ensuring that the goto was itself never executed, without the compiler knowing that.

So far I've identified two things that work, but are hacks. As graemef mentioned, the compiler figures out the dummy variable trick, unless you additionally fool it somehow.
1. The variable can be volatile (thanks ntubski).
2. The variable can be global.

The application of all this is machine-generated code that does exception handling. So I'm hesitant to add a new variable to every stack frame, especially one that does nothing except trick the compiler, and especially a volatile one that will interfere with the compiler's optimization. I'm also hesitant to add a global variable that is referenced that often, because of worries about cache locality in the generated code.

Ultimately, something like __attribute__((used)) is what I was hoping to find, but something that can be applied to a line of code instead of a function.
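
For what it's worth, newer compilers offer something close to a statement-level marker: asm goto (added in gcc 4.5 as far as I recall, so after this thread; recent clang accepts it too). This is a swapped-in technique, not something anyone in the thread used. The sketch below assumes such a compiler, and the names resume_point, guarded_asm_goto and handler are hypothetical. The asm statement emits no instructions, but its label list tells the compiler that control may transfer to handler, so the block should be kept:

```c
#include <stdio.h>

/* Hypothetical global that receives the label address. */
void *resume_point;

/* Returns 0 on the normal path; the handler block would return 1. */
int guarded_asm_goto(void)
{
    resume_point = &&handler;   /* GNU C: address of a label */

    /* Empty template: no instructions are emitted, but the label list
       (after the fourth colon) declares handler as a possible branch
       target, so the compiler must treat that block as reachable. */
    __asm__ goto("" : : : : handler);

    return 0;

handler:
    puts("reached the handler block");
    return 1;
}
```

Unlike the volatile or global dummy, this adds no variable and no run-time load, though it is still compiler-specific.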
 
Old 06-27-2009, 07:07 PM   #9
wje_lq
Member
 
Registered: Sep 2007
Location: Mariposa
Distribution: Debian lenny, Slackware 12
Posts: 808

Rep: Reputation: 178
Ok, here's a solution that's scarier than the question. Consider this shell script:
Code:
cat > 2.c <<EOD
void dont_know()
{
  (void *)(global());
  (*global)();
}
EOD
cat > 1.c <<EOD; gcc 1.c 2.c -o 1; strings 1 | grep xxx
#include <stdio.h>
void *global;
void dont_know();
int main()
{
  int j = 3;
  global = &&main_END;
  dont_know();
  printf("xxxaaaaa\n");
  for(;;0){
  main_END:
    printf("xxxUnreachable\n");
  }
  printf("xxxbbbbb\n");
}
EOD
I'm sure that dont_know() is way out of line, but if I may address your original question:

This compiles the "unreachable" code, but it doesn't seem to compile the line that comes after it. The output is this:
Code:
xxxaaaaa
xxxUnreachable
 
Old 06-27-2009, 07:17 PM   #10
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,398

Rep: Reputation: 814
Quote:
In truth, it's the dont_know() function that performs the jump back into main_END, but the compiler won't know that because dont_know() isn't even implemented (I compiled with -c and omitted it just to prevent the compiler from being able to analyze it). It is still marked unreachable if the dont_know() stuff is implemented.
gcc probably assumes you follow this rule (from Labels as Values):
Quote:
You may not use this mechanism to jump to code in a different function.
Here's a trick: gcc can't analyse inline assembly
Code:
#include <stdio.h>

void* global;
void dont_know();

int main()
{
    int j = 3;
    global = &&main_END;
    dont_know();

    asm("jmp skip_unreachable\n");
    {
    main_END:
        printf("Unreachable\n");
    }

    asm("skip_unreachable:");
}
Quote:
Originally Posted by wje_lq
This compiles the "unreachable" code, but it doesn't seem to compile the line that comes after it.
I believe you have an infinite loop there.
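
To spell that out: a for statement with an empty condition never exits by itself, and the 0 in for(;;0) sits in the increment slot, where it does nothing. So the printf after the loop really is unreachable, and gcc is right to drop it. A small sketch (count_until is a made-up name, and the break is added only so the function can return):

```c
/* Counts loop iterations until n reaches limit (at least one iteration). */
int count_until(int limit)
{
    int n = 0;

    /* Same shape as for(;;0): init empty, condition empty (treated as
       always true), and an increment with no effect (omitted here). */
    for (;;) {
        n++;
        if (n >= limit)
            break;      /* the only way out of the loop */
    }
    return n;
}
```

Without the break, nothing placed after the loop could ever run.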

Last edited by ntubski; 06-27-2009 at 07:19 PM. Reason: had an extra irrelevant line of code
 
Old 06-27-2009, 08:12 PM   #11
PatrickNew
Senior Member
 
Registered: Jan 2006
Location: Charleston, SC, USA
Distribution: Debian, Gentoo, Ubuntu, RHEL
Posts: 1,148
Blog Entries: 1

Original Poster
Rep: Reputation: 48
I don't use a goto to jump back to the address. Rather, I'm using libunwind to put the entire state into a data structure (all registers, including instruction pointer). Then when exceptions are thrown, I modify the instruction pointer to point to the stack unwinding code before resuming from that state.

Quote:
Originally Posted by ntubski View Post
Here's a trick: gcc can't analyse inline assembly
I like that idea a lot. I'd prefer a portable solution, but even if that required me to maintain per-platform code, the libunwind I depend on is only portable to 3 architectures, so that's not that big a deal. I like that - it might become my solution.

Thanks all!
 
Old 06-27-2009, 08:29 PM   #12
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,398

Rep: Reputation: 814
What does libunwind give you that setjmp/longjmp doesn't?
 
Old 06-27-2009, 09:28 PM   #13
wje_lq
Member
 
Registered: Sep 2007
Location: Mariposa
Distribution: Debian lenny, Slackware 12
Posts: 808

Rep: Reputation: 178
Quote:
Originally Posted by PatrickNew View Post
I'd prefer a portable solution
So does my
Code:
for(;;0){
trick do anything for you?
 
Old 06-27-2009, 10:46 PM   #14
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453
Here is a piece of code which apparently works:

Code:
sergei@amdam2:~/junk> cat -n unreachable_code.c
     1  #include <stdio.h>
     2  void *start_addr;
     3  void *end_addr;
     4
     5  int main()
     6    {
     7    __label__ start;
     8    __label__ end;
     9    int j = 3;
    10    start_addr = &&start;
    11    end_addr = &&end;
    12
    13    printf("start_addr=%lx end_addr=%lx\n", (unsigned long)start_addr, (unsigned long)end_addr);
    14    if((int)end_addr == -1)
    15      {
    16      goto end;
    17      }
    18
    19    if(0)
    20      {
    21  start:
    22      printf("Unreachable 1, j=%d\n", j);
    23  end:
    24      printf("Unreachable 2, j=%d\n", j);
    25      }
    26
    27    return 0;
    28    }
sergei@amdam2:~/junk> gcc -Wall -Wextra unreachable_code.c -o unreachable_code
sergei@amdam2:~/junk> ./unreachable_code
start_addr=804841a end_addr=804841c
sergei@amdam2:~/junk>
It can be simplified - probably one doesn't need to take addresses of labels.

'goto' in
Code:
    14    if((int)end_addr == -1)
    15      {
    16      goto end;
    17      }
will never be executed in reality.
 
Old 06-27-2009, 11:54 PM   #15
PatrickNew
Senior Member
 
Registered: Jan 2006
Location: Charleston, SC, USA
Distribution: Debian, Gentoo, Ubuntu, RHEL
Posts: 1,148
Blog Entries: 1

Original Poster
Rep: Reputation: 48
Quote:
Originally Posted by ntubski View Post
What does libunwind give you that setjmp/longjmp doesn't?
The reason I'm interested is that I'm writing a compiler for a higher-level language that generates GNU C. The relevant parts of the language are very C++-like, so we can think of this as a C++ compiler for the purposes of this discussion.

As in C++, variables can be declared on the stack, and their associated destructors are run when that variable goes out of scope. This is easy to do when exceptions are not involved, either by manually inserting destructor calls or using gcc's cleanup attribute.
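
A minimal sketch of the cleanup attribute for the non-exception case, assuming gcc or clang. The names destroyed_total, destroy_int and scoped_demo are invented for illustration. The cleanup function receives a pointer to the variable and runs when the variable goes out of scope:

```c
/* Records what the "destructor" saw, so the effect is observable. */
static int destroyed_total;

/* Cleanup function: called with a pointer to the variable leaving scope. */
static void destroy_int(int *value)
{
    destroyed_total += *value;
}

/* Returns the value seen by the cleanup function. */
int scoped_demo(void)
{
    destroyed_total = 0;
    {
        int resource __attribute__((cleanup(destroy_int))) = 7;
        (void)resource;     /* silence unused-variable warnings */
    }   /* destroy_int(&resource) runs here, at the end of the scope */
    return destroyed_total;
}
```

Here scoped_demo() returns 7, because the cleanup ran when the inner block ended.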

The trouble is with exceptions. My first idea for implementing exceptions was the traditional combination of setjmp() at the try and longjmp() at the throw, but it became very difficult to unwind the stack that way. Consider that we don't know how many functions in the call tree there are between the point of throw and the point of catch.
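
A rough sketch of that setjmp()/longjmp() scheme, using only standard C. The names (try_point, throw_exception, middle, try_call) are invented for illustration and are not from my generated code. It shows the unwinding problem: longjmp() goes straight back to the try point, so the cleanup step in the intermediate frame never runs:

```c
#include <setjmp.h>

static jmp_buf try_point;    /* saved state of the "try" */
static int thrown_code;      /* value carried by the "exception" */
static int cleanups_run;     /* counts cleanup steps that actually ran */

/* "throw": record the code and jump back to the try point. */
static void throw_exception(int code)
{
    thrown_code = code;
    longjmp(try_point, 1);
}

/* An intermediate frame with work to do on exit. */
static void middle(void)
{
    throw_exception(42);
    cleanups_run++;          /* stands in for a destructor; skipped on throw */
}

/* "try/catch": returns 0 if nothing was thrown, else the thrown code. */
int try_call(void)
{
    cleanups_run = 0;
    thrown_code = 0;
    if (setjmp(try_point) == 0) {
        middle();
        return 0;
    }
    return thrown_code;      /* reached via longjmp() */
}
```

After try_call() returns 42, cleanups_run is still 0. Note also that setjmp() is paid on every entry to the try, thrown or not.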

Libunwind solves two big problems for me. The first is the ability to "stop at" every function between throw and catch (and jump to a custom point in that function for unwinding code). The second is the performance characteristics of libunwind vs setjmp()/longjmp(). setjmp() has a non-negligible cost, and most times it is called it is unnecessary since most try blocks exit normally. I prefer a solution that might make throwing an exception expensive, but incurs no cost until an exception is thrown.

@wje_lq I'm going to test that and get back to you.

@Sergei Steshenko I'll test that too. I suspect that the unreachable code might still be removed. The difference between the two pointers is only 2. Even if that 2 is interpreted as 2 words, that's probably not enough code to call a varargs function.
 
  

