LinuxQuestions.org
10-28-2016, 11:41 AM   #1
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,225

Rep: Reputation: 5320
How to reverse-engineer binaries?


I'm interested in learning how to reverse engineer binaries. Specifically, I'd like to master the use of gdb and other tools to find out which system calls they're executing at each stage, and which resources they're using and/or loading. I'm only interested in doing this on Linux.

Which links and/or books would you recommend?

Last edited by dugan; 10-28-2016 at 11:45 AM.
 
10-29-2016, 07:51 AM   #2
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Using "gdb and other tools to find out which system calls" is not reverse engineering. These are used for debugging a program.

Reverse engineering (properly) is a three to four step operation and requires at least three people.

1. identify all inputs and outputs of the program, and document them (person 1)
2. identify the functions carried out by the program and document them (person 1)
3. re-implement the functions based on the documentation from steps 1 and 2. (person 2)
4. have a lawyer review the process as the project continues. (person 3)

The problem with "reengineering" using the debugging tools is that this only gives you a derivative of the original programming, along with the copyright restrictions of the original program.

Even doing it properly is still subject to software patents (even if they are stupid).

The problem with one person doing it is that the implementation person MAY include copyrighted code from steps 1 and 2 when using some of the debugging tools/dump analysis tools, thus contaminating the result with copyright restrictions. This was the charge against the Samba project as it reimplemented AD and CIFS, even though the only thing being examined was the network packets being passed around.

Now for finding out what most Linux programs do (the open source ones), just get the source and look. Besides seeing the code, you also get any comments that might explain why something was done the way it was done. Who knows - you may even come up with a better way. If that happens, submit it as an improvement...

As for books, you can start with:
https://www.google.com/search?q=text...utf-8&oe=utf-8
or the PDF format
https://beginners.re/

NOTE: this one is written from a Russian point of view, and does not cover any legal ramifications. Some of the activities would be counted as illegal (as in breaking any license you have on proprietary software - specifically about disassembling the code via either a debugger or other tools).

But google for others - there are quite a few.

Last edited by jpollard; 10-29-2016 at 07:58 AM.
 
10-30-2016, 01:21 PM   #3
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Rep: Reputation: 3941
This would be better described as "blackbox forensic analysis." You really aren't trying to disassemble anything, nor to change the way in which the code executes (as "debuggers" try to do). Instead, you are gathering observations: when a resource is loaded, log the fact that it did so. When a system call occurs, log it. And so on.
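
As a rough illustration of that kind of observation-gathering, here is a minimal Python sketch that runs a program under strace and tallies what it logged. It assumes strace is installed and that its log lines look like `name(args) = result`, optionally prefixed by a PID when `-f` is used. The function names `summarize` and `trace`, and the choice of which calls count as "opening a resource", are mine for illustration and not from any particular tool.

```python
import os
import re
import subprocess
import tempfile
from collections import Counter

# Optional PID column (present with "strace -f"), then "name(".
SYSCALL_RE = re.compile(r'^(?:\d+\s+)?(\w+)\(')
# First double-quoted argument on the line.
PATH_RE = re.compile(r'"([^"]*)"')
# Assumption: these are the calls we treat as "loading a resource".
OPEN_CALLS = {"open", "openat", "creat"}

def summarize(lines):
    """Return (syscall counts, paths passed to open-style calls)."""
    counts = Counter()
    opened = []
    for line in lines:
        m = SYSCALL_RE.match(line)
        if not m:
            continue  # signal notes, exit status, "<... resumed>" lines
        name = m.group(1)
        counts[name] += 1
        if name in OPEN_CALLS:
            p = PATH_RE.search(line)
            if p:
                opened.append(p.group(1))
    return counts, opened

def trace(cmd):
    """Run cmd (a list of strings) under strace and summarize its log."""
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "trace.log")
        # -f follows child processes, -o writes the trace to a file.
        subprocess.run(["strace", "-f", "-o", log, *cmd], check=False)
        with open(log) as fh:
            return summarize(fh)
```

Something like `counts, opened = trace(["ls", "/tmp"])` would then give a per-syscall tally plus the list of files the program tried to open. The program being observed is not modified at all.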

Sometimes this is done using a modified kernel, or, if the calls are actually being made through a loaded userland library (as is commonly the case), a logging-enabled version of that library. The program itself is generally untouched.
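
Before swapping in a logging-enabled library, it helps to know which libraries the program actually has loaded. On Linux, `/proc/<pid>/maps` lists every file mapped into a running process, so a small Python sketch along these lines can enumerate them. The function names are hypothetical, and it assumes the usual six-column layout of that file (address, perms, offset, dev, inode, pathname).

```python
def mapped_files(maps_text):
    """Return the distinct file paths named in /proc/<pid>/maps content."""
    seen = []
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        # Skip pseudo-entries such as [heap], [stack], [vdso].
        if path.startswith("/") and path not in seen:
            seen.append(path)
    return seen

def mapped_files_of(pid):
    """Read the maps file of a live process (needs permission to do so)."""
    with open(f"/proc/{pid}/maps") as fh:
        return mapped_files(fh.read())
```

Calling `mapped_files_of(pid)` on the target shows the executable itself plus each shared object it has mapped, which are the candidates for a logging replacement.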

"Disassembling to produce replacement source code" is made quite problematic by optimizing compilers, which produce object-code sequences that may be quite different from the source-code but which are functionally equivalent to it.
 
  

