LinuxQuestions.org
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Old 12-05-2006, 09:55 PM   #1
taylor_venable
Member
 
Registered: Jun 2005
Location: Indiana, USA
Distribution: OpenBSD, Ubuntu
Posts: 892

Rep: Reputation: 43
[Discussion] Ten Lines of Code or Less


A professor today told my class some interesting figures regarding useful code. He claimed that out of all the code that is written by any given person over one day's time, only ten lines or less will actually be correct and part of a delivered solution. Here are the numbers he gave for useful code per day, depending on application domain:

User / Web / CRUD application: 10 lines
Operating system / utility / library: 2 lines
Realtime / concurrent system: 1 line

I'm no expert in software engineering or whatever field generated these numbers, but they seem really low to me. Even if you consider that there are a lot of really terrible programmers out there who every day are doing nothing but writing 100% broken code (which I know there are), these averages are still horrible! From intuition alone, I'd like to think that the real numbers, if you could even come up with such, are higher than this. Anyway, I know that I write more than ten good lines of code a day!

So out of curiosity for what seems to me like either an oddity or an error, can anybody else provide some extra insight / evidence / studies that say something on these figures?
 
Old 12-05-2006, 10:06 PM   #2
nadroj
Senior Member
 
Registered: Jan 2005
Location: Canada
Distribution: ubuntu
Posts: 2,539

Rep: Reputation: 60
in college many teachers told me this as well -- not your exact figures, but basically that the average programmer will write less than 5 lines of code a day. this is taking into account what doesn't work and was removed. it doesn't mean that they come to work, write 5 lines in 10 minutes, and go home.
although these statistics and the ones i was told in college do seem surprisingly low, i would say i believe it.
 
Old 12-05-2006, 10:11 PM   #3
tuxdev
Senior Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 2,012

Rep: Reputation: 115
That number (as you quoted it) sounds more like a mode, not a mean. By itself, that number doesn't say anything at all. Lines of code in what? 10 lines CL or 10 lines Assembly? In maintenance mode or creation mode? Isn't it well known that LOC was a mostly useless metric, like BogoMips?
 
Old 12-05-2006, 10:14 PM   #4
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
Sounds pretty dismal to me, but it might be average under some conditions like heavily reviewed code with expensive (in terms of time and effort) change tracking and approval procedures.

I can't be bothered to find a link now, but I was shocked earlier this year when there was a story (I think it was slashdotted) which gave some statistic about Microsoft programmers. I think the stat was lines per year per dev or something similar, and I worked out that, taking account of holiday days and so on, these guys were turning out something like 15 lines of code a day.

I was utterly flabbergasted at this. I personally wrote about 150,000 lines of code in the space of two years, and I wasn't coding all day because I was doing a production support job at the same time. Now I'm sure there were plenty of mistakes and mis-implementations in that lot, but it worked, and AFAIK still works more or less unaltered some 5 years later.

Now this isn't realtime code I'm talking about by any stretch of the imagination - much of it was rather dull code doing menial data extraction tasks (probably falls under this "CRUD" label, whatever that means, and mostly in Perl), but even so it shows that it's possible to churn out a whole lot more code than that in a day.

I think one of the main variables is that what I was writing was done almost totally on my own, and with no externally enforced approval and auditing procedures. Revision control was with RCS (just because it was available).

In short, I think this lines of code metric is only useful when comparing development in like environments - with similar code review procedures and number of people working on a given task.
 
Old 12-05-2006, 10:18 PM   #5
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
Quote:
Originally Posted by tuxdev
That number (as you quoted it) sounds more like a mode, not a mean. By itself, that number doesn't say anything at all. Lines of code in what? 10 lines CL or 10 lines Assembly? In maintenance mode or creation mode? Isn't it well known that LOC was a mostly useless metric, like BogoMips?
Coming from an academic, I expect he/she was thinking of lines of MIX, Ada, Miranda and Prolog. I spent a very long time putting out just a few lines of Miranda because I spent most of the time wondering what possible use this stupid language was in the real world. The answer so far, in the 10 years or so since leaving uni is: none at all.
 
Old 12-06-2006, 06:57 PM   #6
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Rep: Reputation: 148
Quite how your professor came to those figures I'm not certain, and I'm not going to dispute them. I suspect they come from a limited number of companies that work in large-scale environments (not a single user churning out code; remember, your course is Software Engineering). I want to look at the lines-of-code metric, but before I get to that I will touch briefly on what happens to all those lines of code in the calculations. First, a programmer will write a lot of code, but in the whole equation a programmer does not stand alone (again, considering programming in the large). The system will have project managers, system architects, testers, reviewers, technical support, personal assistants, telephone operators, cleaners, sales force, etc., who are all employed by the organisation and yet do not contribute a single line of code. Some will even reduce the number of lines of code, but their job (in different ways) is to improve the quality of the product. Then of course an organisation will need to factor in sick days and holidays for each member of staff...

Now what is a line of code (LOC)?

Consider the following function.
PHP Code:
1./** @function    acceptUserDetails()
2. *  @description function that will check and modify the user details
3. */
4.
5.function acceptUserDetails($clean, $style='')
6.{
7.   // Initialise the local variables
8.   $userName = $clean['userName'];
9.   $dataChanged = false;
10.   $passwordChanged = false;
11.   $rightsChanged = false;
12.
13.   // Has a new password been entered?
14.   if ($clean['logPassword1']!='' || $clean['logPassword2']!='')
15.   {
16.      $passwordChanged = true;
17.      $dataChanged = true;
18.   }
19.
20.   // Has the authorisation level been changed
21.   if ($clean['su']!=UserRights($userName))
22.   {
23.      $rightsChanged = true;
24.      $dataChanged = true;
25.   }
26.
27.   if ($dataChanged)
28.   {
29.      if ($passwordChanged)
30.      {
31.         // update the password file
32.         if (!updatePasswords($clean['logName']
33.                             ,$clean['logPassword1']
34.                             ,$clean['logPassword2']
35.                             )
36.            )
37.         {
38.            return changeUserDetails($userName, $clean, $style);
39.         } // end if Password error
40.      } // end if password has changed
41.      if ($rightsChanged)
42.      {
43.         // update the authorisation file
44.         if (!updateUserRights($userName, $clean['su']))
45.         {
46.            return changeUserDetails($userName, $clean, $style);
47.         } // end if user rights error
48.      } // end if user rights have changed
49.   } // end if some data has changed
50.
51.   return true;
52.} // end function acceptUserDetails()
The function consists of 52 lines, but that figure would rarely be considered a valid value for the lines of code. Of those 52 lines there are 5 blank lines, 8 lines that are just comments, 2 lines that just have a close parenthesis, and 16 lines that essentially consist of just an open or close brace. None of these 31 lines would be counted as LOC, and yet they are there to help the readability of the code. That reduces the 52 lines down to 21.
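As a rough check of that tally, here is a minimal Python sketch that sorts each physical line of the function into blank, comment, brace-only, parenthesis-only, or code. The classifier is a simplification written for this one function, not a standard LOC tool: the names `PHP_SOURCE` and `classify` are illustrative, and the comment rule only recognises the comment styles that appear in the post.

```python
from collections import Counter

# Copy of the PHP function from the post (physical lines 1-52).
PHP_SOURCE = """
/** @function    acceptUserDetails()
 *  @description function that will check and modify the user details
 */

function acceptUserDetails($clean, $style='')
{
   // Initialise the local variables
   $userName = $clean['userName'];
   $dataChanged = false;
   $passwordChanged = false;
   $rightsChanged = false;

   // Has a new password been entered?
   if ($clean['logPassword1']!='' || $clean['logPassword2']!='')
   {
      $passwordChanged = true;
      $dataChanged = true;
   }

   // Has the authorisation level been changed
   if ($clean['su']!=UserRights($userName))
   {
      $rightsChanged = true;
      $dataChanged = true;
   }

   if ($dataChanged)
   {
      if ($passwordChanged)
      {
         // update the password file
         if (!updatePasswords($clean['logName']
                             ,$clean['logPassword1']
                             ,$clean['logPassword2']
                             )
            )
         {
            return changeUserDetails($userName, $clean, $style);
         } // end if Password error
      } // end if password has changed
      if ($rightsChanged)
      {
         // update the authorisation file
         if (!updateUserRights($userName, $clean['su']))
         {
            return changeUserDetails($userName, $clean, $style);
         } // end if user rights error
      } // end if user rights have changed
   } // end if some data has changed

   return true;
} // end function acceptUserDetails()
""".strip("\n")


def classify(line):
    """Assign one physical line to a bucket (simplified rules, see lead-in)."""
    text = line.strip()
    if not text:
        return "blank"
    if text.startswith(("/**", "*", "//")):
        return "comment"
    code = text.split("//", 1)[0].strip()  # drop a trailing // comment
    if code in ("{", "}"):
        return "brace"
    if code == ")":
        return "paren"
    return "code"


counts = Counter(classify(line) for line in PHP_SOURCE.splitlines())
total = sum(counts.values())
print(total, dict(counts))
print("LOC after discounting:", counts["code"])
```

Under these rules the buckets come out as 5 blank, 8 comment, 16 brace and 2 parenthesis lines, which leaves 21 of the 52. Changing a single rule in `classify` (for example, counting a brace with a trailing comment as code) shifts the result, which is rather the point being made.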

Now some views of LOC will:
Ignore the function signature lines, in this case 1 line (line 5)
Ignore initialisation statements, in this case 4 lines
Ignore control statements (such as if(), while(), for() etc.), in this case 9 lines (lines 14, 21, 27, 29, 32-34, 41, 44)
Ignore return statements, in this case 3 lines (lines 38, 46, 51)
Ignore function calls, in this case 0 lines (since they have already been discounted)

That leaves me with 4 lines of code, namely lines 16-17 and 23-24. Obviously some people will measure the LOC of this function differently. For example, if the control statements are counted (there are 7 if statements), the LOC jumps to 11; add the function calls on lines not already counted and there is an increase of 2 LOC, to 13. As a metric the LOC is both useful and meaningless: it is useful as a rough gauge of productivity, but it is so easy to inflate. If we decide that the function above is 4 LOC, then I could argue that I have inflated the value by 100%. I have done that by introducing the local variable $dataChanged. I initialise it once, set it twice and use it once, but the line where I use it (line 27) could be replaced with the following:

PHP Code:
if ($passwordChanged || $rightsChanged)
So what is the value of the LOC? (As a former lecturer in Comp Sci I'd suggest that the LOC metric allows Software Engineering professors to have an impressive slide in their lecture, but not much else.)

Function points tend to be a more useful metric and I'm sure that you will soon learn all about them, enjoy!
 
Old 12-06-2006, 07:16 PM   #7
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
The idea that if and while conditions should not be counted as lines of code is somewhat suspect. This is program logic. It generates machine code when compiled. It's code to my mind. Having said that, I'm speaking from a purely personal point of view - I don't have any formal code auditing experience.

Overall though, I agree that LOC is a very rough metric. But then metrics are often only possible when you use rules of thumb.
 
Old 12-06-2006, 09:03 PM   #8
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Rep: Reputation: 148
First I did say some views of LOC... The reason is as follows:

if($num >= 1)
if($num <= 9)

equates to two LOC, whereas:

if ($num >= 1 && $num <= 9)

equates to one LOC. But it is what is done if the statement evaluates to true that is important. Otherwise I could add empty if statements just to bloat the LOC count. And production code does have empty if statements, not because of any desire to bloat the LOC count but because over time the code that was executed has been commented out.
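The same effect shows up in any language. A minimal sketch in Python (used here only for illustration; the post's snippets are PHP), with a hypothetical `in_range` function written twice: both versions behave identically, yet a count of `if` lines gives 2 for one and 1 for the other.

```python
# Two equivalent implementations held as source text so their lines can be counted.
NESTED = """
def in_range(num):
    if num >= 1:
        if num <= 9:
            return True
    return False
"""

COMBINED = """
def in_range(num):
    if num >= 1 and num <= 9:
        return True
    return False
"""


def count_if_lines(source):
    """Count physical lines that start an if statement."""
    return sum(1 for line in source.splitlines() if line.strip().startswith("if "))


def load(source):
    """Execute the source text and return the in_range function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["in_range"]


nested, combined = load(NESTED), load(COMBINED)

# Same behaviour for every input tried, different control-statement counts.
same_behaviour = all(nested(n) == combined(n) for n in range(-2, 13))
print(same_behaviour, count_if_lines(NESTED), count_if_lines(COMBINED))
```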

That said, I agree that ignoring control statements is debatable. They are certainly used in other metrics to calculate the complexity of the function. (I'm not an SE expert so I forget their name.)
 
Old 12-07-2006, 09:05 PM   #9
taylor_venable
Member
 
Registered: Jun 2005
Location: Indiana, USA
Distribution: OpenBSD, Ubuntu
Posts: 892

Original Poster
Rep: Reputation: 43
Nice demonstration, graemef; I agree that LoC is a fuzzy (at best) way to measure productivity. Even still, taking the most strict view of four lines of code, that code doesn't really do a whole lot. Assuming an eight-hour workday, should your example take a programmer three hours and twelve minutes to write? Furthermore, this looks like very basic logic, so the ten LoC per day estimate is most applicable, but it also is the most optimistic estimate. The claims are saying that if you write one good line of OS-level code, it effectively took you four hours. From a business perspective alone, I'd be pretty upset if my team-mates / employees were writing only one good line of code every four hours. Even if it is an average over the entire population, all the incompetent programmers included, it's still a pretty dismal assessment. Maybe it is true, and maybe that's just the sad state that the software industry is in today.
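The arithmetic behind those times can be sketched in a few lines of Python. This assumes an eight-hour workday and takes the professor's per-day figures from the first post at face value; the function name `time_to_write` is made up for the illustration.

```python
WORKDAY_MINUTES = 8 * 60  # assumption: an eight-hour workday

# Useful lines per day, as quoted in the first post.
LINES_PER_DAY = {
    "User / Web / CRUD application": 10,
    "Operating system / utility / library": 2,
    "Realtime / concurrent system": 1,
}


def time_to_write(lines, lines_per_day):
    """Return (hours, minutes) implied for `lines` good lines at a daily rate."""
    minutes = round(lines * WORKDAY_MINUTES / lines_per_day)
    return divmod(minutes, 60)


# Four lines of basic application logic at 10 lines per day.
print(time_to_write(4, LINES_PER_DAY["User / Web / CRUD application"]))
# One line of OS-level code at 2 lines per day.
print(time_to_write(1, LINES_PER_DAY["Operating system / utility / library"]))
```

The first call gives 3 hours 12 minutes and the second gives 4 hours, matching the figures above.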

With regards to languages, that's another hole in the use of LoC as a metric for productivity. Compare the number of lines of code required to construct a list of like-typed numeric values satisfying an arbitrary arithmetic constraint in C++ and in Haskell. Plus some languages and applications (Lisp, Ruby on Rails) are particularly known for their ability to extract a lot of use from highly compressed code. The professor didn't mention language when he was talking about this, but I assume that since he claimed it was a global average it was heavy in Java, C++, C, COBOL, Fortran, etc.

Looking at things from a more flexible perspective (one open to more than just the mainstream languages), and judging from other inconsistencies in the use of LoC for a productivity metric (e.g. what exactly is a "line"), these numbers might not be inaccurate per se, but instead completely irrelevant, i.e. they don't yield any useful information. If you arrive at that conclusion, you have to admit that such figures, when given, are pretty misleading.
 
Old 12-07-2006, 09:57 PM   #10
frob23
Senior Member
 
Registered: Jan 2004
Location: Roughly 29.467N / 81.206W
Distribution: OpenBSD, Debian, FreeBSD
Posts: 1,450

Rep: Reputation: 48
On a new program (for personal use), I average about 50-200 lines of code a day (2-8 hours, usually 4) for the first week depending on the size of the program. The larger programs will typically be higher only if I have a very clear outline for how I want it done. These numbers exponentially taper off as time goes on and the program gets more fleshed out.

As the program matures and grows larger, the amount of new lines being produced drops quickly once the basics have been laid down. Many days I'll see a negative count. For example, I had a function which was in the middle of a stats loop (which could be executed many millions of times for one run of the program) and I spent about an hour looking at different ways of coding that function so that it would be both easy to understand/maintain and produce efficient code. In the end I saved about half a second of real processing time (for each 100,000,000 loops on this hardware) -- which is nothing really -- and was left with 4 less lines of code than I started with. Since this was all the time I devoted to the code today it would seem to be counter productive (it was) but that's how programming can be sometimes.

And then there are the days spent hunting down elusive bugs which must be squashed. You can spend 100 hours tracking down a dozen bugs and adding less than 20 lines of code to fix them all. And you're now well under that five lines of code a day. And many coders spend the majority of their time debugging code, not creating new code.
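To put rough numbers on that, a small Python sketch follows. It assumes an eight-hour working day and uses 20 lines as the upper bound for the bug-hunting example, so the real rate would be lower still; the helper name `lines_per_workday` is invented for the illustration.

```python
WORKDAY_HOURS = 8  # assumption: an eight-hour working day


def lines_per_workday(net_lines, hours_spent, workday_hours=WORKDAY_HOURS):
    """Average net lines per working day over the hours spent."""
    workdays = hours_spent / workday_hours
    return net_lines / workdays


# 100 hours of bug hunting that adds at most 20 lines.
bug_hunt_rate = lines_per_workday(net_lines=20, hours_spent=100)
print(f"{bug_hunt_rate:.1f} lines per day at most")

# The tuning session described earlier: nothing added, four lines removed.
tuning_day_net = 0 - 4
print(tuning_day_net, "net lines for that day")
```

That works out to at most 1.6 lines per day for the bug hunt, and a net count of -4 for the day spent tuning one function.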

I would say that amateur (and that's not meant in a bad way) programmers tend to produce more code than professional programmers simply because they're working under different conditions. Quick hacks which work well enough to get the job done and then just fade away can add many lines to a programmer's day... but extended products which must ship out the door, after extensive debugging and review... with team members often undoing your work in favor of something else and you doing the same... well, your lines per day are abysmal at times.

Last edited by frob23; 12-07-2006 at 10:00 PM.
 
Old 12-09-2006, 12:04 AM   #11
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Rep: Reputation: 148
Frob23 has made an important observation regarding the maintenance of code. For a moderate system you might at the peak have 30 programmers on it. Once it is in production you will need to have someone to support that system; they might be spending their time on other systems as well, but their LOC will be minimal and possibly negative. So that starts to reduce the LOC count.

Then you might decide to rewrite part of the system, say from a C to a C++ core. At the end of the day you might have a more robust system with easier maintenance and a greater potential to enhance the system, but the functionality is the same: same system, different language and possibly fewer lines of code. Yet that rewrite might have taken several man-years of effort.

So, yes, it all comes back to whether the LOC metric has any value. And a rule I live by is that when any statistic is bandied about, treat it with a pinch of salt: it adds flavour (in this case to the lecture) but doesn't add any substance.

graeme.
 
Old 12-09-2006, 06:06 PM   #12
NDR008
Member
 
Registered: Nov 2006
Location: (Bristol or Coventry) (UK) or Malta
Distribution: openSUSE 11.0
Posts: 173

Rep: Reputation: 30
I would think in a programming language like BASIC it is easier to give meaning to LOC.
 
Old 12-10-2006, 08:50 PM   #13
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Rep: Reputation: 148
It's not so much counting the lines of code it is agreeing on what LOC means. For most languages a script can be run to extract the number of lines. Once a definition of LOC has been decided, then what is its value? Personally I feel it has less value than a chocolate teapot.
 
Old 12-10-2006, 10:31 PM   #14
dmail
Member
 
Registered: Oct 2005
Posts: 970

Rep: Reputation: Disabled
Quote:
Originally Posted by graemef
Personally I feel it has less value than a chocolate teapot.
Couldn't agree more.

Quote:
It's not so much counting the lines of code it is agreeing on what LOC means.
Yes and no. Some of the best code I have written consisted of only a few lines, and they were C++ templates. How can they be measured?
 
Old 12-10-2006, 11:52 PM   #15
indienick
Senior Member
 
Registered: Dec 2005
Location: London, ON, Canada
Distribution: Arch, Ubuntu, Slackware, OpenBSD, FreeBSD
Posts: 1,853

Rep: Reputation: 65
This is a pretty intense thread, I've gotta say - makes me feel smarter just by reading it.

I agree with NDR008 that BASIC is a language in which counting the LOC one by one (with no guidelines specifying what a "line of code" is) could be beneficial. I say this because in BASIC, while you can squeeze more than one statement onto a line, it's usually frowned upon from a BASIC programming standpoint. The antithesis to this is Lisp, where, while you can spread code about on different lines, you can just as easily combine lists of code in one line (assuming line wrap is disabled). The readability gets shot to pieces with a violent malice, but whether you combine all the code on a single line or spread it out over several lines isn't judged in the least - either is acceptable.

LOC metrics are a joke. Software companies can implement all the LOC metrics they want, but the real measure should somehow incorporate the functionality, both imminent and possible, of the written code. Any LOC metric should dismiss "flowery" code - code that just fancifies the program's output - and comments (obviously), and focus on the functionality of the code, which would cause the at-the-time-of-writing importance value of the code to fluctuate constantly.
 
  

