LinuxQuestions.org
Old 02-02-2012, 02:45 PM   #1
Redrobes
LQ Newbie
 
Registered: Feb 2012
Posts: 9

Rep: Reputation: Disabled
GCC: why are there strings in the binary?


Hello, I'm a long-time Windows programmer but a newbie at compiling Linux apps. I wanted to port one of my freeware but closed-source applications to Linux, which I have now done, and it all works just fine.

My concern, however, is that the resulting binary is full of strings from the source files, which constitutes an IP release. This is a build without debugging information (no -g); I can also set -g0 to make sure there is no debug info, and the result is the same.

My app is quite large, uses many classes with some virtual functions, uses shared libraries such as math, GL and SDL, and statically links a few of the more liberally licensed ones such as libz.a.

Aside from the strings I would expect in the binary, such as error messages, I can also see class names, function names and even source file names. I have tried the -O2 and -Os optimization flags and have tried marking the visibility as hidden. I have used the -s strip option as well as running the strip utility on the executable. All to no avail.

I have also run 'strings' on some of the apps that ship with Ubuntu, such as the chess and Quadrapassel games, and they also have some file names and function names in the binary.

For info, I'm using Ubuntu 10.04 with its default gcc, so I am compiling with g++ 4.4.3-4, but I am doing this in preparation for moving to an ARM-based embedded PC running Debian.

So, to the questions: is it normal for gcc to stuff the binary with this information? Is there any way to remove it? Do I have to run a code obfuscator over the source before compilation to mitigate it?

Hope you can help!
 
Old 02-02-2012, 02:51 PM   #2
Redrobes
LQ Newbie
 
Registered: Feb 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
Now that I have posted this, the forum is suggesting this thread:
http://www.linuxquestions.org/questi...ptions-831917/

This is not what I need. I'm OK with all the hard-coded strings I put into the app myself. It's the source code class names, private function names and source file names that I object to being there.
 
Old 02-02-2012, 03:19 PM   #3
theNbomr
Senior Member
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 4,506

Rep: Reputation: 602
If your application is not statically linked, it will have public symbols embedded which will be used to link with shared object libraries. Is that what you are talking about?

--- rod.
 
Old 02-02-2012, 06:41 PM   #4
Redrobes
LQ Newbie
 
Registered: Feb 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
Hi Rod, and thanks for posting, but it's not the shared library strings that are at issue here. If I take the final release binary from a gcc build and type 'nm app.exe', it lists all of the symbols, and they include entries labelled with the file names of the source files. All of the names of the class members are there too. OK. If I run "strip -o Stripped.exe Orig.exe", it generates a smaller file. If I then run "nm Stripped.exe", it says "No symbols". Great! BUT... if I then run "strings Stripped.exe | grep cpp", it lists all of the file names used to make up the app again. Using strings, I can still see many function names and class names.

So it is not just the function names needed for shared libraries; many member function names and file names are in there too.
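The reason 'strings' still finds text after 'nm' reports no symbols is that it does not read the symbol table at all: it scans the raw bytes of the file for runs of printable characters. Below is a minimal sketch of that idea, assuming a default minimum run length of 4 like the real tool. The function name printable_runs is made up for illustration, and this is not the binutils implementation.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collect every run of printable characters (or tabs)
// in `data` that is at least `min_len` bytes long.
std::vector<std::string> printable_runs(const std::string& data,
                                        std::size_t min_len = 4) {
    std::vector<std::string> runs;
    std::string current;
    for (std::size_t i = 0; i < data.size(); ++i) {
        const unsigned char ch = static_cast<unsigned char>(data[i]);
        if (std::isprint(ch) || ch == '\t') {
            current += static_cast<char>(ch);   // extend the current run
        } else {
            if (current.size() >= min_len) {
                runs.push_back(current);        // run ended and is long enough
            }
            current.clear();
        }
    }
    if (current.size() >= min_len) {
        runs.push_back(current);                // run that reaches end of data
    }
    return runs;
}
```

Any byte sequence that happens to look like text is reported, whichever section it lives in, so stripping the symbol table alone does not change what this kind of scan sees in read-only data.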

As an aside, many forum posts suggest that you can try to statically link an app, but gcc will still try to link libgcc and libc as shared libraries, and even if you override that, it is reportedly unsafe because of ABI changes between versions in how they interact with the kernel. So I'm not sure it's possible to statically link an app properly with gcc anyway.
 
Old 02-02-2012, 07:31 PM   #5
Redrobes
LQ Newbie
 
Registered: Feb 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
Here is a concrete example:

Code:
#include <stdio.h>


class Red
{
public:
	Red() { }
	
	virtual void HelloWorld() = 0;
};


class Robes : public Red
{
public:
	Robes() { }
	
	virtual void HelloWorld();
};


void Robes::HelloWorld()
{
	printf( "Hello World\n" );
}


int main( int argc, char *argv[] )
{
	Robes r;
	r.HelloWorld();
	return 0;
}
compile with:
g++ -Wall -O2 -c Redrobes.cpp -o objs/Redrobes.o

link with:
g++ -fmax-errors=20 -Wl,-O,-s,--gc-sections objs/Redrobes.o -o HelloWorld.exe

i.e. no specific shared libs this time. Then:
strip -o Stripped.exe HelloWorld.exe

and:
strings Stripped.exe

produces at the end:
...
Hello World
5Robes
3Red


The "Hello World" is fine, as that is my const string. But "Robes" and "Red" should not be in the file; they are the class names. In this example the file name does not appear, but in a larger app many more source strings end up in the binary; it's chock full of them. I think it has something to do with the ELF section naming conventions. Can anyone explain how these strings are generated, or how to stop source identifiers and source file names from being carried into the ELF section names?
 
Old 02-03-2012, 10:26 AM   #6
gnashley
Amigo developer
 
Registered: Dec 2003
Location: Germany
Distribution: Slackware
Posts: 4,434

Rep: Reputation: 303
Use 'sstrip' to strip harder, or 'strip --remove-section=sectionname' to remove specific sections.
 
Old 02-03-2012, 10:02 PM   #7
Redrobes
LQ Newbie
 
Registered: Feb 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
Thanks, that's good advice, and it does shrink my binary a little more, but it still doesn't remove the strings. I have downloaded LLVM and built the latest version, and the strings are in the binary it produces too. It seems that the section has to be there, but I suspect it does not need to carry that specific name. If the worst comes to the worst I could obfuscate the source manually before compiling, but that seems like overkill.
 
Old 02-06-2012, 05:03 PM   #8
tuxdev
Senior Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 1,997

Rep: Reputation: 107
What you're seeing is the mangled names from RTTI. No amount of stripping is going to take those out, because the compiler/linker can't prove that .name() will never be called on those types.
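To see where strings such as 5Robes and 3Red come from, here is a small sketch built on the classes from the example earlier in the thread. The helper runtime_type_name is a made-up name for illustration. It needs RTTI enabled, which is the compiler's default.

```cpp
#include <string>
#include <typeinfo>

class Red {
public:
    virtual ~Red() {}
    virtual void HelloWorld() = 0;
};

class Robes : public Red {
public:
    virtual void HelloWorld() {}
};

// Hypothetical helper: returns the implementation-defined name held in the
// type_info object for the dynamic type of `object`.
std::string runtime_type_name(const Red& object) {
    return typeid(object).name();
}
```

With g++, calling runtime_type_name on a Robes object returns "5Robes": the length of the class name followed by the name itself, which is exactly the text that 'strings' found. Because that text has to be available at run time, it is stored as ordinary read-only data rather than in the symbol table, so strip leaves it alone.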

Don't go source-mangling; it is not worth the headaches. You're trying to use a technical solution for a legal problem, which is the same reason DRM never works out. The only way to avoid IP issues completely is to not release *anything*.
 
Old 02-06-2012, 07:44 PM   #9
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 1,421

Rep: Reputation: 360
Quote:
Originally Posted by tuxdev
What you're seeing is the mangled names from RTTI. No amount of stripping is going to take those out, because the compiler/linker can't prove that .name() will never be called on those types.
Perhaps -fno-rtti would help, then?
 
Old 02-07-2012, 01:47 PM   #10
tuxdev
Senior Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 1,997

Rep: Reputation: 107
Quote:
Perhaps -fno-rtti would help, then?
It's pretty unlikely for a non-trivial C++ program to not use RTTI in some form, especially for what looks like a game.
 
Old 02-07-2012, 03:41 PM   #11
Redrobes
LQ Newbie
 
Registered: Feb 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
OK thanks, that does sound like a likely cause. What I have noticed as I include more and more static libraries, both my own code and free libs like PNG, in the main app is that a) C code does not seem to show up at all and b) library code contributes far fewer strings than the main code. As I move code out of the main core into the appropriate library sections and link it back in, the amount of string info goes down. I also note that it seems to be more prevalent with virtual functions than anything else.

As for RTTI, though it might get used as part of normal C++ type use or overloading etc., I personally use it little or not at all. In those cases I would normally define an enum in the base class, pass one of its values to the constructor, and switch on it and cast manually. I know that might not be 'best practice', but it keeps me in control of how the type is determined. I'll definitely try some options around RTTI and see what happens. RTTI would account for the class names being there, but the filename.cpp names still being there seems a bit iffy. Maybe I have botched some aspect of the debug setup so that __FILE__ is being pulled in for non-debug builds.
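The enum-in-the-base-class approach described above can be sketched as follows. All class and function names here (Shape, Circle, Square, width_of) are invented for illustration and are not from the poster's application. The point is that it relies only on static_cast, so it does not need dynamic_cast or typeid.

```cpp
#include <cassert>

class Shape {
public:
    enum Kind { KIND_CIRCLE, KIND_SQUARE };

    virtual ~Shape() {}
    Kind kind() const { return kind_; }

protected:
    // Each derived class passes its own tag up through the constructor.
    explicit Shape(Kind kind) : kind_(kind) {}

private:
    Kind kind_;
};

class Circle : public Shape {
public:
    explicit Circle(int r) : Shape(KIND_CIRCLE), radius(r) {}
    int radius;
};

class Square : public Shape {
public:
    explicit Square(int s) : Shape(KIND_SQUARE), side(s) {}
    int side;
};

// Switch on the hand-rolled tag, then downcast with static_cast.
int width_of(const Shape& shape) {
    switch (shape.kind()) {
    case Shape::KIND_CIRCLE:
        return 2 * static_cast<const Circle&>(shape).radius;
    case Shape::KIND_SQUARE:
        return static_cast<const Square&>(shape).side;
    }
    assert(!"unknown Shape::Kind");
    return 0;
}
```

The trade-off is that the tag and the casts have to be kept in sync by hand: a wrong tag leads to an unchecked cast, where dynamic_cast would have detected the mismatch.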

While I can be sure about how I use RTTI myself, I don't know whether SDL uses it. I also don't know if I can switch RTTI off just for the main code and leave it on for some libs. If it's per file then I could add a make option for the main code or for each lib, but I suspect that it is a link option. I'll check it out.

Thanks for posting, though. Really helpful.
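On the __FILE__ suspicion raised above: one common way for source file names to reach a release binary is a logging or assertion macro that expands __FILE__ unless NDEBUG is defined. The standard assert macro behaves this way, and optimization flags such as -O2 do not define NDEBUG on their own. A minimal sketch, where APP_TRACE, trace_origin and load_level are hypothetical names rather than anything from the poster's code:

```cpp
#include <cstdio>

// APP_TRACE is a hypothetical project macro, used here only for illustration.
#ifdef NDEBUG
// Release build: expands to nothing, so no __FILE__ literal is emitted here.
#define APP_TRACE(msg) ((void)0)
#else
// Debug build: __FILE__ expands to a string literal holding this file's path.
#define APP_TRACE(msg) \
    std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__, (msg))
#endif

// Returns the file name a debug build embeds, or "" when NDEBUG is defined.
const char* trace_origin() {
#ifdef NDEBUG
    return "";
#else
    return __FILE__;
#endif
}

void load_level() {
    APP_TRACE("loading level");
}
```

If macros of this kind are in play, building the release configuration with -DNDEBUG and re-running 'strings' would show whether they account for the .cpp names.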
 
Old 02-07-2012, 07:25 PM   #12
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 1,421

Rep: Reputation: 360
Quote:
Originally Posted by Redrobes
While I can be sure about how I use RTTI myself, I don't know whether SDL uses it.
Since SDL is a plain C library, you can be certain it doesn't use RTTI.
 
Old 02-13-2012, 07:03 PM   #13
Redrobes
LQ Newbie
 
Registered: Feb 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
I have tried the -fno-rtti option and it has indeed removed all the unwanted strings from the binary. So it took a combination of a few options, but mainly this one, and the question can be marked [SOLVED] now. Thanks very much. I have now ported my first app successfully. I'll do the others and release a full Linux set alongside the Windows set in due course.

For those interested, I used:

compiler:
-Wall -fno-rtti -Os -fdata-sections -ffunction-sections

and

linker:
-Wl,-O,-s,--gc-sections
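One thing to watch for with these flags: -fno-rtti is a compile option applied to each source file, and any use of typeid or dynamic_cast in a file built with it becomes a compile error. GCC defines the macro __GXX_RTTI only when RTTI is enabled, so code that wants a type name for diagnostics can be guarded to build either way. A sketch, assuming a GCC-compatible compiler; Widget and debug_type_name are made-up names:

```cpp
#include <string>
#ifdef __GXX_RTTI
#include <typeinfo>
#endif

// Hypothetical polymorphic class, used only for illustration.
struct Widget {
    virtual ~Widget() {}
};

// Returns a type name for diagnostics when RTTI is available, and a fixed
// placeholder when the file is compiled with -fno-rtti.
std::string debug_type_name(const Widget& widget) {
#ifdef __GXX_RTTI
    return typeid(widget).name();
#else
    (void)widget;                 // parameter is unused in this configuration
    return "<rtti disabled>";
#endif
}
```

The GCC manual warns that mixing objects compiled with and without RTTI may not work, so it is probably safest to apply the flag consistently to all C++ code that shares class hierarchies. Plain C libraries are unaffected.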
 
  

