LinuxQuestions.org
Old 12-23-2015, 06:15 PM   #1
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,223

Rep: Reputation: 5320
What do you use to analyse a C/C++ codebase?


What tools do you use to analyze and understand C and C++ projects? So far I've only been using cscope and ack (a grep-like tool).
 
Old 12-23-2015, 08:24 PM   #2
NoStressHQ
Member
 
Registered: Apr 2010
Location: Geneva - Switzerland ( Bordeaux - France / Montreal - QC - Canada)
Distribution: Slackware 14.2 - 32/64bit
Posts: 609

Rep: Reputation: 221
Quote:
Originally Posted by dugan View Post
What tools do you use to analyze and understand C and C++ projects? So far I've only been using cscope and ack (a grep-like tool).
It's hard to find a free tool for that. I sometimes use Doxygen; it needs some tweaking, but it can generate nice docs with basic class diagrams.

But in the end I often do "human crawling", as it's still more efficient: reading the code and taking notes on the important components. It's basic reverse engineering, but with the source code, which is still easier than disassembling.

Good luck.

Garry.
 
Old 12-23-2015, 08:33 PM   #3
Ztcoracat
LQ Guru
 
Registered: Dec 2011
Distribution: Slackware, MX 18
Posts: 9,484
Blog Entries: 15

Rep: Reputation: 1176
Maybe try NCC?

http://ask.slashdot.org/story/08/01/...rstanding-code
http://codist.tripod.com/

"Understand" is a static analysis tool for maintaining, measuring, and analyzing:-
http://alternativeto.net/software/codenavigator/

There are some tools on this page 'similar to grep'. Not sure if they are of any use.
http://www.gnu.org/software/global/links.html

Gave it my best-
HTH
 
Old 12-23-2015, 08:41 PM   #4
ttk
Senior Member
 
Registered: May 2012
Location: Sebastopol, CA
Distribution: Slackware64
Posts: 1,038
Blog Entries: 27

Rep: Reputation: 1484
I use find(1), grep(1), gdb(1) and sometimes make notes in a text editor, particularly to collapse excessive layers of abstraction -- I'll put a function or structure type name at indentation level 0, paste in the points of interest at level 1, and then follow the calls and put their points of interest at indentation level 2, and then follow their calls etc. So I end up with all of the points of interest and their dependencies on a single page, each line annotated with file names and notes. This usually gives me a much clearer idea of what's going on.

I'll also mark up code with comments as I discover things, else I'll forget them (if not that very session, then days or weeks or months later).

Since I learned programming via Wirth's data structure centric philosophy, I'll often start by finding the data types which interest me, and then grep around for everywhere those types are used. If I can understand the code as transformations of data, then remembering every detail of the code's implementation is less important, and I can think about the program in terms of intention.

This is a lot easier with C than with Python, which will often interpose ludicrous amounts of indirection between types and their use. Sometimes it's impossible to tell from looking at Python code what type a variable will reference, and I have to insert logging statements and actually run it to learn more. This is seldom necessary with C, and when it is, gdb works great for exploring runtime state.
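
The "find the data types, then grep around for everywhere they are used" step above can be sketched in a few lines. This is only a rough illustration, not a replacement for grep or cscope: the helper name `find_type_uses` and the choice of `.c`/`.h` suffixes are assumptions made for the example.

```python
import re
from pathlib import Path

# Assumed set of suffixes; extend it for .cpp, .hpp, etc. as needed.
C_SUFFIXES = {".c", ".h"}

def find_type_uses(root, type_name):
    """Return (path, line_number, text) for each line that mentions type_name."""
    # \b stops "node" from matching inside "node_count" or "xnode".
    pattern = re.compile(r"\b" + re.escape(type_name) + r"\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in C_SUFFIXES:
            continue
        text = path.read_text(errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_type_uses("src", "node")` returns one tuple per matching line, which can be pasted straight into indented notes of the kind described above.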
 
1 members found this post helpful.
Old 12-23-2015, 09:51 PM   #5
Richard Cranium
Senior Member
 
Registered: Apr 2009
Location: McKinney, Texas
Distribution: Slackware64 15.0
Posts: 3,858

Rep: Reputation: 2225
Quote:
Originally Posted by ttk View Post
This is a lot easier with C than with Python, which will often interpose ludicrous amounts of indirection between types and their use. Sometimes it's impossible to tell from looking at Python code what type a variable will reference, and I have to insert logging statements and actually run it to learn more. This is seldom necessary with C, and when it is, gdb works great for exploring runtime state.
Actually, Python's classes are little more than suggestions. It is absolutely possible to pick an object and replace the implementation of any or all of its methods with unique code.

Not many people actually do that, but it certainly can be done. I've done it (snip from some code that I wrote quite a while back)...
Code:
import configparser  # the module was named ConfigParser in Python 2

config_file = configparser.ConfigParser()
# The next line replaces the optionxform method with the str()
# function. Wild. That (BTW) makes the options lookup case sensitive.
config_file.optionxform = str
That actually makes unit testing and code coverage very important with Python code.
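
To make the point concrete, here is a small sketch of replacing a method on one object only. The `Greeter` class and `loud_greet` function are made up for illustration; `types.MethodType` is the standard-library way to bind a function to a single instance so that it receives `self`.

```python
import types

class Greeter:
    """Hypothetical class, used only for illustration."""
    def greet(self):
        return "hello"

def loud_greet(self):
    # Replacement implementation for a single object.
    return "HELLO!"

patched = Greeter()
plain = Greeter()

# Bind the replacement to one instance; the class itself is untouched.
patched.greet = types.MethodType(loud_greet, patched)

print(patched.greet())  # HELLO!
print(plain.greet())    # hello
```

Nothing in the class definition hints that `patched` behaves differently, which is why reading the source alone can mislead and why tests matter here.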
 
Old 12-23-2015, 09:58 PM   #6
Richard Cranium
Senior Member
 
Registered: Apr 2009
Location: McKinney, Texas
Distribution: Slackware64 15.0
Posts: 3,858

Rep: Reputation: 2225
And to provide an actual comment for the OP (sorry for the derail above), you could use Emacs with CEDET and ECB (see https://www.logilab.org/blogentry/173886 for some discussion as well as http://cedet.sourceforge.net/ and http://ecb.sourceforge.net/screenshots/index.html). When I was still doing C++ development, I used ebrowse (https://www.gnu.org/software/emacs/m...ode/ebrowse/); however that was almost 16 years ago.

Eclipse appears to support C/C++ projects, as does NetBeans.
 
Old 12-24-2015, 01:53 AM   #7
a4z
Senior Member
 
Registered: Feb 2009
Posts: 1,727

Rep: Reputation: 742
The documentation, if it exists; the accompanying samples, if they exist; an IDE with proper code navigation (nearly all have it these days, but Eclipse CDT is very good); a debugger; and asking the author.
And simply work with the code. It's always the same: at the beginning you think WTF, then after some pain it gets better, and if you don't give up, after some time and more pain you will start to understand it, or find out that it is an organically grown mass of lines of code that accidentally does something.
 
Old 12-24-2015, 02:49 AM   #8
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,263
Blog Entries: 24

Rep: Reputation: 4194
I am with these guys and gals...

Quote:
Originally Posted by NoStressHQ View Post
...I often do "human crawling", as it's still more efficient: reading the code and taking notes on the important components. It's basic reverse engineering, but with the source code...
Quote:
Originally Posted by Ztcoracat View Post
Gave it my best-
HTH
Which I will slightly transform into, "Give it my best"!

Quote:
Originally Posted by ttk View Post
I use find(1), grep(1), gdb(1) and sometimes make notes in a text editor, particularly to collapse excessive layers of abstraction -- [description of notation methods] ...So I end up with all of the points of interest and their dependencies on a single page, each line annotated with file names and notes. This usually gives me a much clearer idea of what's going on.

I'll also mark up code with comments as I discover things, else I'll forget them (if not that very session, then days or weeks or months later).

...grep around... If I can understand the code as transformations of data, then remembering every detail of the code's implementation is less important, and I can think about the program in terms of intention.

NOTE: I do this but didn't know it was a method with a name!
Which I will summarize as follows: Generate well organized, useful notes to serve as a kind of model, comment code inline when you actually come to understand it, and try to understand what it does before worrying about how it does it!

Quote:
Originally Posted by a4z View Post
...simply work with the code. It's always the same: at the beginning you think WTF, then after some pain it gets better, and if you don't give up, after some time and more pain you will start to understand it, or find out that it is an organically grown mass of lines of code that accidentally does something.
I never adapted to using an IDE and find the *nix built-ins preferable for most of my own purposes - so vim, find, grep, etc... and I make lots of notes!

For other people's code I usually try to work up a few simple annotated UML-type diagrams, mostly use case, class and sequence, sufficient to cover my area of interest - rarely more complete. This is the only way I can maintain continuity for more than a single session!

For my own projects I have found that it is always worthwhile to start with similar simple, but more complete, modeling diagrams and notes, and to keep them in sync with the real world as I write code.

For diagnostics, I mostly use an interactive style of debugging, inserting messages and breakpoints, plus valgrind to keep memory leaks and execution choke points under control. I use gdb at times, but usually find I get by with my own eclectic collection of habits and methods.

My own most valuable tips:

* Organize project spaces from a shell, never a graphical interface, within a purpose built directory structure.
* Use a terminal mux like screen or tmux to allow efficient edit, search, reference operations using those wonderful *nix built-ins, always at your fingertips.
* Write your own makefiles, at least in early project phases - it has a terrific focusing effect! Later, if you use autotools or another makefile generator framework, you can make it do what you want instead of spending several days figuring out what it wants you to do!
* Write code and validate against your model in smallish, well defined increments - you never get lost that way!
* Write and maintain test cases and code in parallel with the project code - always!

I find these things put my code much more clearly in mind and speed development literally at every keystroke.
 
2 members found this post helpful.
Old 12-24-2015, 11:55 AM   #9
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,223

Original Poster
Rep: Reputation: 5320
I've just discovered codequery and I'll be trying it out soon. I like the fact that it uses cscope as part of its backend.
 
Old 12-24-2015, 12:15 PM   #10
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,223

Original Poster
Rep: Reputation: 5320
Quote:
Originally Posted by Richard Cranium View Post
That actually makes unit testing and code coverage very important with Python code.
Oh, absolutely. Having 100% unit test coverage but no asserts, for a Python codebase, is as good as compiling a static-language codebase.

It's also true that one of the tradeoffs with dynamically typed languages is that static-analysis tools are much less powerful. That's something I would consider when choosing between Go and Node.js.

Last edited by dugan; 12-24-2015 at 01:05 PM.
 
Old 12-24-2015, 01:28 PM   #11
Richard Cranium
Senior Member
 
Registered: Apr 2009
Location: McKinney, Texas
Distribution: Slackware64 15.0
Posts: 3,858

Rep: Reputation: 2225
You can get the worst of both worlds with Groovy.
 
Old 12-24-2015, 04:39 PM   #12
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,838

Rep: Reputation: 7308
Yes, mainly the documentation, and probably the original creators/owners, may give you helpful answers. Anyway, it also depends on the size, the timeframe, the costs and the reasons.
 
Old 12-26-2015, 09:57 AM   #13
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,541

Rep: Reputation: 1065
Because I'm old and have been doing this for a long, long time, I have a fondness for lint.

lint? Yeah, lint. Ain't no lint in Linux but there is splint. Oldie, goodie, freebie, works well enough for me: http://www.splint.org/

Builds clean in Slackware 32- and 64-bit, runs just fine in Slackware 64-bit 14.1.

Yammers at you about everything, just like Unix lint does.

Here's an example, a program for converting degrees, minutes, and seconds to decimal degrees (for map making):
Code:
cat dms2deg.c

#ident	"$Id: dms2deg.c,v 1.1.1.1 2009/10/07 17:59:37 trona Exp $"

/*
 *	Copyright (C) 2000-2009 Thomas Ronayne
 *
 *	This program is free software; you can redistribute it and/or
 *	modify it under the terms of version 2 of the GNU General
 *	Public License as published by the Free Software Foundation.
 *
 *	This program is distributed in the hope that it will be useful,
 *	but WITHOUT ANY WARRANTY; without even the implied warranty of
 *	MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 *	General Public License for more details.
 *
 *	You should have received a copy of the GNU General Public
 *	License along with this program; if not, write to the Free
 *	Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
 *	MA 02111-1307, USA.
 *
 *	Name:		$Source: /usr/local/cvsroot/gnis/dms2deg.c,v $
 *	Purpose:	convert DDD MM SS to decimal degrees
 *	Version:	$Revision: 1.1.1.1 $
 *	Modified:	$Date: 2009/10/07 17:59:37 $
 *	Author:		T. N. Ronayne
 *	Date:		21 Jul 2009
 *	$Log: dms2deg.c,v $
 *	Revision 1.1.1.1  2009/10/07 17:59:37  trona
 *	initial installation Slackware 13.0
 *	
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include "gnis.h"

#ifndef	TRUE
#	define	TRUE	1
#endif
#ifndef	FALSE
#	define	FALSE	0
#endif

void	main	(int argc, char *argv [])
{
	char	where [2];		/* N S E W			*/
	int	c;			/* general-purpose		*/
	int	error = FALSE;		/* error flag			*/
	int	vopt = FALSE;		/* verbose option		*/
	double	degree = 0.0;		/* degree			*/
	double	minute = 0.0;		/* minute			*/
	double	second = 0.0;		/* second			*/
	time_t	t0 = (time_t) 0;	/* start time			*/
	time_t	t1 = (time_t) 0;	/* finish time			*/
	FILE	*in;

	/*	process the command line arguments			*/
	while ((c = getopt (argc, argv, "?d:m:s:w:v")) != EOF) {
		switch (c) {
		case '?':
			error = TRUE;
			break;
		case 'd':
			degree = strtod (optarg, (char **) NULL);
			break;
		case 'm':
			minute = strtod (optarg, (char **) NULL);
			break;
		case 's':
			second = strtod (optarg, (char **) NULL);
			break;
		case 'w':
			(void) strcpy (where, optarg);
			break;
		case 'v':
			vopt = TRUE;
			break;
		default:
			(void) fprintf (stderr, "getopt() bug\n");
			exit (EXIT_FAILURE);
		}
	}
	/*	any errors in the arguments, or a '?' entered...*/
	if (error) {
		(void) fprintf (stderr, "usage: %s [-v] argument...\n",
		    argv [0]);
		exit (EXIT_FAILURE);
	}
	/*	get a start time				*/
	if (time (&t0) < (time_t) 0)
		(void) fprintf (stderr,
		    "%s:\tcan't read system clock\n", argv [0]);
	(void) fprintf (stdout,
	     "%.0lf:%.0lf:%.0lf %s = %.8lf\n",
	     degree, minute, second, where,
	     dmsdeg (degree, minute, second, where[0]));
	/*	get a finish time			*/
	if (time (&t1) < (time_t) 0)
		(void) fprintf (stderr,
		    "%s:\tcan't read system clock\n", argv [0]);
	if (vopt)
		(void) fprintf (stderr,
		    "%s duration %g seconds\n",
		    argv [0], difftime (t1, t0));
	exit (EXIT_SUCCESS);
}
Here's what splint says about it:
Code:
splint dms2deg.c
Splint 3.1.2 --- 26 Dec 2015

dms2deg.c:36: Include file <unistd.h> matches the name of a POSIX library, but
    the POSIX library is not being used.  Consider using +posixlib or
    +posixstrictlib to select the POSIX library, or -warnposix to suppress this
    message.
  Header name matches a POSIX header, but the POSIX library is not selected.
  (Use -warnposixheaders to inhibit warning)
dms2deg.c:47:6: Function main declared to return void, should return int
  The function main does not match the expected type. (Use -maintype to inhibit
  warning)
dms2deg.c: (in function main)
dms2deg.c:58:8: Variable in shadows outer declaration
  An outer declaration is shadowed by the local declaration. (Use -shadow to
  inhibit warning)
   gnis.h:159:7: Previous definition of in: FILE *
dms2deg.c:87:6: Test expression for if not boolean, type int: error
  Test expression type is not boolean or int. (Use -predboolint to inhibit
  warning)
dms2deg.c:99:39: Array element where[0] used before definition
  An rvalue is used that may not be initialized to a value on some execution
  path. (Use -usedef to inhibit warning)
dms2deg.c:104:6: Test expression for if not boolean, type int: vopt
dms2deg.c:58:8: Variable in declared but not used
  A variable is declared but never used. Use /*@unused@*/ in front of
  declaration to suppress message. (Use -varuse to inhibit warning)

Finished checking --- 7 code warnings
Dang, I gotta clean that thing up.

Hope this helps some.
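
The `dmsdeg()` helper that the program above calls lives in `gnis.h` and is not shown in the post. A minimal sketch of the usual degrees/minutes/seconds arithmetic, written in Python for brevity, might look like the following. The name `dms_to_degrees` and the convention that south and west are negative are assumptions, not taken from the original source.

```python
def dms_to_degrees(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # Assumed convention: south and west are negative.
    if hemisphere.upper() in ("S", "W"):
        value = -value
    return value

print(f"{dms_to_degrees(45, 30, 0, 'N'):.8f}")   # 45.50000000
print(f"{dms_to_degrees(83, 15, 36, 'W'):.8f}")  # -83.26000000
```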
 
Old 12-26-2015, 10:49 AM   #14
NoStressHQ
Member
 
Registered: Apr 2010
Location: Geneva - Switzerland ( Bordeaux - France / Montreal - QC - Canada)
Distribution: Slackware 14.2 - 32/64bit
Posts: 609

Rep: Reputation: 221
Quote:
Originally Posted by tronayne View Post
Because I'm old and have been doing this for a long, long time, I have a fondness for lint.
I think the OP is asking about "symbolic analysis" to "browse" the source and "understand" it... not static code analysis to find bugs.
 
Old 12-30-2015, 08:06 AM   #15
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,882
Blog Entries: 13

Rep: Reputation: 4930
Eyeballs and fingers.

The larger a code base is, the more disorganized it is and the more inactive or unused code exists, or the more optional features exist.

Very serious projects, like the Linux kernel, follow a typical organizational pattern.

If there's some huge code base that has no documentation, no general organization, then it's rather difficult and risky to consider using that.

I'm talking about a case where someone comes to me as a developer and suggests that we use some base of code to produce a serious result. The larger the project, the more time expected to be put into it. Starting with some huge base of code that is unproven, undocumented, and not well enough organized to be figured out in a few hours of browsing is not worth anyone's time.

On the other hand, I've purchased source intended for use as part of a solution and in evaluating the product, or engaging it in use, the ones which have gone the best are the exact opposite of the negative things I'm talking about here. They are documented, they have test and validation harnesses, they have comments, and they follow a general organizational method.

Therefore, use whatever code viewer/organizer works best for you.

I'd also try and build it to see how many errors and warnings there are.
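
One rough way to put a number on that is to save the build output to a file and count the diagnostics. The sketch below assumes GCC/Clang-style lines of the form `file:line:col: warning: ...`; the helper name `count_diagnostics` and the sample log are made up for illustration, and other compilers will need a different pattern.

```python
import re
from collections import Counter

# Matches lines laid out like "file.c:12:5: warning: message".
# "fatal error" is counted as an error.
DIAGNOSTIC = re.compile(
    r"^\S[^:]*:\d+(?::\d+)?: (?:fatal )?(warning|error): ",
    re.MULTILINE,
)

def count_diagnostics(build_log):
    """Count warnings and errors in captured compiler output."""
    return Counter(match.group(1) for match in DIAGNOSTIC.finditer(build_log))

# Made-up sample in the assumed layout, not real tool output.
SAMPLE_LOG = """\
example.c:12:5: warning: unused variable 'x'
example.c:30:9: warning: unused variable 'y'
example.c:41:3: error: 'z' undeclared
make: *** [example.o] Error 1
"""

counts = count_diagnostics(SAMPLE_LOG)
print(counts["warning"], counts["error"])  # 2 1
```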
 
  


All times are GMT -5.