Old 06-17-2011, 10:47 AM   #1
Lost_Oracle
LQ Newbie
 
Registered: Jul 2010
Distribution: Red Hat Commercial
Posts: 16

Rep: Reputation: 2
Perl - Making this code more efficient


Hello everyone,

I am fairly new to Perl but have a fairly strong background in C++, so my Perl usually looks like C++ when I'm done with it. This program works just fine; I was just wondering if anyone had any tips or ideas for making it more compact or efficient.

The following code goes line by line through a text file (entries separated by spaces) that looks like this:

Code:
Number of Files    IDnumber    TYPE
2                  001001      
1                  001001      type1
1                  001001      type2
2                  001001      type3
3                  001002      type2
4                  001002      type3
.
.
etc.
A file ID may appear up to four times: once with each of the three types (denoted by the literal strings "type1", "type2", and "type3") and possibly once with no type. The text file is ordered by IDnumber, so all 001001 files will appear consecutively in the text file.
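
To make the input format concrete, here is a minimal sketch of parsing one data line. The helper name parse_line and the placeholder label "none" for an empty TYPE column are assumptions made for illustration; they are not part of the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one data line into (count, id, type).
# A row with an empty TYPE column comes back with the type "none".
sub parse_line {
    my ($line) = @_;
    # split ' ' skips leading whitespace and drops trailing empty fields
    my ($count, $id, $type) = split ' ', $line;
    return unless defined $id;            # blank or malformed line
    return ($count, $id, defined $type ? $type : 'none');
}

for my $line ("2                  001001      \n",
              "1                  001001      type1\n") {
    my ($count, $id, $type) = parse_line($line);
    print "$id $type $count\n";
}
```

Because split returns strings, the leading zeros in an ID such as 001001 survive as long as the ID is never used in arithmetic.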

The goal is to read in all of this information and output it like so:

Code:
IDnumber type1 type2 type3 total
001001   1     1     2      4
Percent  25    25    50     100
.
.
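
The arithmetic behind the Percent row can be sketched as follows. The helper name percentages is hypothetical, and the sketch assumes, as the sample output suggests, that rows with no type are left out of the total.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: turn per-type counts into percentages of their total.
# Returns zeros when the total is zero, so nothing is ever divided by zero.
sub percentages {
    my @counts = @_;
    my $total  = sum(0, @counts);
    return map { $total ? 100 * $_ / $total : 0 } @counts;
}

my @counts = (1, 1, 2);                  # type1, type2, type3 for ID 001001
my @pct    = percentages(@counts);

# Print the count row, then the percentage row, in aligned columns.
printf "%-8s %-5d %-5d %-5d %d\n",         '001001',  @counts, sum(0, @counts);
printf "%-8s %-5.0f %-5.0f %-5.0f %.0f\n", 'Percent', @pct,    sum(0, @pct);
```

For ID 001001 the total is 1 + 1 + 2 = 4, so the shares are 25, 25, and 50 percent.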
Here is the current code:

Code:
#!/usr/bin/perl

#Open input and output files
open(FILE, '<', 'input.txt') or die "Cannot open input.txt: $!";
open(OUT, '>>', 'output.txt') or die "Cannot open output.txt: $!";
#print headings
print OUT "IDnumber, type1, type2, type3, total\n";
#skip first line of file
$line = <FILE>;
#read in first line
$line = <FILE>;
@data = split (" ", $line );
$type1_count = 0;
$type2_count = 0;
$type3_count = 0;
$notype_count = 0;
$old_id = 0;
$old_num = 0;
$old_type = "";
$new_num = $data[0];
$new_id = $data[1];
$new_type = $data[2];
if ( $new_type =~ "type2")
{
	$type2_count = $new_num;
}
elsif ( $new_type =~ "type1")
{
	$type1_count = $new_num;
}
elsif ( $new_type =~ "type3" )
{
	$type3_count = $new_num;
}
else
{
	$notype_count = $new_num;
}

while ( $line = <FILE>)
{
	$old_num = $new_num;
	$old_id = $new_id;
	$old_type = $new_type;
	@data = split(" ", $line);
	$new_num = $data[0];
	$new_id = $data[1];
	$new_type = $data[2];

	if ( $new_id ne $old_id )	#compare IDs as strings
	{
		$sum = $type1_count + $type2_count + $type3_count;
		if ( $type1_count == 0 )
		{
			$type1_Per = 0;
		}
		else
		{
			$type1_Per = $type1_count/$sum*100;
		}
		if ( $type2_count == 0 )
		{
			$type2_Per = 0;
		}
		else
		{
			$type2_Per =  $type2_count/$sum*100;
		}
		if ( $type3_count == 0 )
		{
			$type3_Per = 0;
		}
		else
		{
			$type3_Per = $type3_count/$sum*100;
		}
		$Per_sum= $type1_Per+$type2_Per+$type3_Per;
		if ( $sum != 0 )
		{
			print OUT "$old_id, $type1_count, $type2_count, $type3_count, $sum\n";
			#print OUT "% Count, $type1_Per%, $type2_Per%, $type3_Per%, $Per_sum%\n";
		}	
		$type1_count = 0;
		$type2_count = 0;
		$type3_count = 0;
		$notype_count = 0;
		$sum = 0;
	}

	if ( $new_type =~ "type2")
	{
		$type2_count = $new_num;
	}
	elsif ( $new_type =~ "type1")
	{
		$type1_count = $new_num;
	}
	elsif ( $new_type =~ "type3" )
	{
		$type3_count = $new_num;
	}
	else
	{
		$notype_count = $new_num;
	}
}
        $old_num = $new_num;
        $old_id = $new_id;
        $old_type = $new_type;
        $sum = $type1_count + $type2_count + $type3_count ;

	if ( $type1_count == 0 )
        {
                $type1_Per = 0;
        }
        else
        {
                $type1_Per = $type1_count/$sum*100;
        }
        if ( $type2_count == 0 )
        {
                $type2_Per = 0;
        }
        else
        {
                $type2_Per =  $type2_count/$sum*100;
        }
        if ( $type3_count == 0 )
        {
                $type3_Per = 0;
        }
        else
        {
                $type3_Per = $type3_count/$sum*100;
        }
        $Per_sum= $type1_Per+$type2_Per+$type3_Per;

        if ( $sum != 0 )
	{
		print OUT "$old_id, $type1_count, $type2_count, $type3_count, $sum\n";
		print OUT "% Count, $type1_Per%, $type2_Per%, $type3_Per%, $Per_sum%\n";
	}
 
Old 06-17-2011, 11:50 AM   #2
markush
Senior Member
 
Registered: Apr 2007
Location: Germany
Distribution: Slackware
Posts: 3,979

Rep: Reputation: Disabled
You should read about hashes in Perl, in particular the "hash of hashes". The outer hash is keyed by IDnumber, and the value stored for each IDnumber is itself a hash with entries type1, type2 and so on.

For the input
Code:
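open IN, "input.txt" or die $!;
my $header = <IN>;                          # skip the heading line
while (<IN>) {
         my ($number, $idnumber, $type) = split(" ", $_);
         next unless defined $idnumber;     # skip blank lines
         $type = "notype" unless defined $type;   # rows with an empty TYPE column
         $id_hash{$idnumber}{$type} += $number;
         # and so on
}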
Not tested!
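
Carried through to the report, the hash-of-hashes idea might look like the sketch below. The sub names tally and report and the inline sample lines are assumptions made for illustration. The sketch leaves rows with no type out of the totals, and it sorts the keys because a hash does not remember the order of the file.

```perl
use strict;
use warnings;

my @types = qw(type1 type2 type3);

# Build { id => { type => count } } from a list of input lines.
sub tally {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        my ($number, $id, $type) = split ' ', $line;
        # Skip blank lines, rows with no TYPE, and the heading line.
        next unless defined $type && $number =~ /^\d+$/;
        $count{$id}{$type} += $number;
    }
    return \%count;
}

# Return the two report rows for one ID: counts, then percentages.
sub report {
    my ($id, $by_type) = @_;
    my @n   = map { $by_type->{$_} || 0 } @types;
    my $sum = 0;
    $sum += $_ for @n;
    return () unless $sum;               # nothing to report for this ID
    my @pct = map { sprintf '%.1f%%', 100 * $_ / $sum } @n;
    return (join(', ', $id, @n, $sum),
            join(', ', '% Count', @pct, '100%'));
}

my @lines = (
    "Number of Files    IDnumber    TYPE\n",
    "2                  001001      \n",
    "1                  001001      type1\n",
    "1                  001001      type2\n",
    "2                  001001      type3\n",
    "3                  001002      type2\n",
    "4                  001002      type3\n",
);

my $count = tally(@lines);
print "IDnumber, type1, type2, type3, total\n";
for my $id (sort keys %$count) {
    print "$_\n" for report($id, $count->{$id});
}
```

To read the real file instead of the inline sample, the lines could come from a filehandle, for example tally(<IN>) after opening input.txt.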

Markus

Last edited by markush; 06-17-2011 at 12:18 PM. Reason: added some text
 
Old 06-17-2011, 12:52 PM   #3
Lost_Oracle
LQ Newbie
 
Registered: Jul 2010
Distribution: Red Hat Commercial
Posts: 16

Original Poster
Rep: Reputation: 2
Thanks for the tip markush. I had thought about trying to implement a hash, but I didn't think about a hash of hashes. Also, I didn't know you could use that form of syntax for the split function. That should help compress the code a bit.

I'm still open to more suggestions, so I won't mark this thread as closed just yet. Also, does anyone have comments, performance-wise, on hashes of hashes vs. arrays the way I originally implemented them? I'm going to switch to hashes unless someone comes up with something even better, but I'm just curious what people have to say on the matter.
 
Old 06-17-2011, 04:03 PM   #4
markush
Senior Member
 
Registered: Apr 2007
Location: Germany
Distribution: Slackware
Posts: 3,979

Rep: Reputation: Disabled
Quote:
Originally Posted by Lost_Oracle View Post
...Also, does anyone have comments performance wise on hashes of hashes vs. arrays the way I originally implemented them? ...
Hashes in Perl are very efficient when you look data up by key; arrays are still faster when you index by position. In general, regular expressions and hashes yield very fast Perl code, and you should prefer them where they fit the problem.
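
Claims about relative speed are easy to check with the core Benchmark module. The following is a minimal sketch with made-up data (1000 integers); the sub names are hypothetical, and the numbers it prints depend on the Perl version and the machine, so treat them as a rough guide only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# The same 1000 values, stored once in an array and once in a hash.
my @by_index = (1 .. 1000);
my %by_key   = map { $_ => $_ } 1 .. 1000;

# Sum every element through an array index.
sub sum_array {
    my $total = 0;
    $total += $by_index[$_] for 0 .. $#by_index;
    return $total;
}

# Sum every element through a hash key.
sub sum_hash {
    my $total = 0;
    $total += $by_key{$_} for 1 .. 1000;
    return $total;
}

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, { array => \&sum_array, hash => \&sum_hash });
```

For an input file like the one in this thread, either structure is likely fast enough that readability is the better reason to choose between them.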
Also, Perl's motto is TMTOWTDI (there's more than one way to do it), so in most cases there is no single "best" solution. Perl code can be very short, and often the shorter code is also the more efficient one. The downside of short code is readability. I have used Perl since version 4 but haven't done very much Perl programming; when I read code I wrote some months ago, I often have difficulty understanding it. This is why I do more of my coding in Ruby. Otherwise Perl is a great language, there are many people around who have written brilliant Perl code, and it is a very nice community.
I'd recommend reading the Camel Book; it is very well written and gives deep insight not only into Perl, but also into Unix/Linux in general: http://oreilly.com/catalog/9780596000271

Markus
 
  

