LinuxQuestions.org
Old 04-27-2009, 11:23 PM   #1
hgate73
LQ Newbie
 
Registered: Sep 2008
Posts: 17

Rep: Reputation: 0
How can I find a window of text in a file using Perl?


I'm trying to find a window of text within a file using Perl, but I'm not having much luck figuring it out. I've done it with the regular shell and Grep, but in this situation I must use Perl.

The basic idea is to pass a search term to the script, along with a window size, and have the script return the search term from within the file(s), along with a "window" of text surrounding the data.

I.E.

./windowSearch -w4 filename(s) searchterm

Results would be something like:

preceding line
preceding line
searchterm
subsequent line
subsequent line

This wasn't too hard with the shell, as shown below, but I need to do this with Perl.

This is my original script

Code:
#!/bin/bash
# Title:          windowSearch
# Purpose:        Finds a window of text in a file
# Requirements:   none

# Arguments:
# -w    specify a window size, e.g. -w 3 searchterm filename
# -f    tell windowSearch that you are searching inside multiple files
#     no argument, tells windowSearch to simply search the file for the string.

RETVAL=0

while test -n "$1"; do
  case "$1" in

  -w|w) # Case to search a single file using a window size
    windowSize=$2; searchTerm=$3; filename=$4
    # Line number of the first match only
    a=$(grep -n -- "$searchTerm" "$filename" | head -n 1 | cut -d":" -f 1)
    [ -n "$a" ] || { echo "No match found"; exit 1; }
    ((b=a+windowSize))
    ((c=a-windowSize))
    ((c<1)) && c=1   # do not run off the top of the file
    echo; sed -n "${c},${b}p" "$filename"; echo
    exit
      ;;

  -f|f) # Case to search multiple files
    searchTerm=$2
    shift 2   # drop the option and the search term, leaving only file names
    echo
    echo "FILE NAME: SEARCH TERM"
    echo "----------------------"
    for var in "$@"
    do
          sed -n "/$searchTerm/s|^|$var: |p" "$var" 2> /dev/null
    done; echo
    exit
      ;;
   *)  # Default search in a file if no argument was specified
    searchTerm=$1
    filename=$2
    sed -n "/$searchTerm/p" "$filename"   # print only the matching lines
    exit
      ;;
  esac
done

exit $RETVAL
Any thoughts or pointers help...I'm kind of lost when it comes to perl.
 
Old 04-27-2009, 11:38 PM   #2
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453
First solve a simpler problem: open the input file(s) and print its/their contents to STDOUT. For example, start from here:

http://www.perlfect.com/articles/perlfile.shtml

Then choosing lines according to some criteria will be simple.

...

And will 'grep' do what you need?
 
Old 04-28-2009, 02:39 AM   #3
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,261

Rep: Reputation: 2028
Well, here are the perldocs (with examples), specifically for open().
http://perldoc.perl.org/functions/open.html
 
Old 04-30-2009, 12:34 AM   #4
hgate73
LQ Newbie
 
Registered: Sep 2008
Posts: 17

Original Poster
Rep: Reputation: 0
Okay, I've made some progress. Here's my script to search a file for a term, but I'm at a loss as to how to search for a window of text around the matching line.

Code:
#############
# Main body #
#############

# Test to make sure a search term and a file name were passed
if ( not defined $ARGV[0] or not defined $ARGV[1] ) {
	die "\nYou need to enter a search term and a file name.\n";
} else { # Do the search

	$search=$ARGV[0];	# Search term
	$file=$ARGV[1];		# File to search

	open(FILE, '<', $file) or die "Cannot open $file: $!\n";	# Open the file as "FILE"
	@array=<FILE>;		# Fill the array
	close (FILE);		# Close the file handle

	print "\nSearch Results\n";

	foreach $line (@array){
		if ($line =~ /$search/){
			print "$line";
		}
	}
}
 
Old 04-30-2009, 12:55 AM   #5
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453
What is a "window"? How many lines before and after the line with the search item?

Have you read about the Perl 'push' and 'shift' functions? Using the two makes it pretty straightforward to keep the window, if I understand correctly what you mean.

By the way, your script is bad in the sense that it is memory hungry, because of

Code:
	@array=<FILE>;		# Fill the array

which reads the whole file into memory at once. Functionally you do not need that at all.
 
Old 04-30-2009, 01:36 AM   #6
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,695
Blog Entries: 5

Rep: Reputation: 241
Quote:
Originally Posted by hgate73
Code:
	foreach $line (@array){
		if ($line =~ /$search/){
			print "$line";
		}
	}
}
Change your for loop to iterate over the indexes of @array, from 0 to $#array (the last index). As the loop visits each element, test it against your search term; when it matches, remember that index. You can then get the window you want by subtracting the window size from that index and adding it to that index. Understand?
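The index-based approach described above could be sketched roughly as follows. This is only an illustration: the helper name window_by_index is hypothetical, the search term is treated as a literal string (via \Q...\E) rather than a regular expression, and the whole file is still assumed to be held in an array.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines within $n lines of any line
# that contains $search (treated as a literal string, an assumption).
sub window_by_index {
    my ($search, $n, @lines) = @_;
    my @out;
    for my $i (0 .. $#lines) {
        next unless $lines[$i] =~ /\Q$search\E/;
        # Clamp the window so it never runs past either end of the array.
        my $from = $i - $n < 0       ? 0       : $i - $n;
        my $to   = $i + $n > $#lines ? $#lines : $i + $n;
        push @out, @lines[$from .. $to];
    }
    return @out;
}

# Demo data standing in for the contents of a file.
my @lines = map { "line $_\n" } 1 .. 12;
print window_by_index('line 10', 1, @lines);
```

With the demo data this prints lines 9, 10 and 11. If two matches sit close together, their windows overlap and the shared lines are returned twice; the sketch does not try to merge them.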
 
Old 04-30-2009, 03:57 AM   #7
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453
Quote:
Originally Posted by ghostdog74
Change your for loop to iterate over the indexes of @array, from 0 to $#array (the last index). As the loop visits each element, test it against your search term; when it matches, remember that index. You can then get the window you want by subtracting the window size from that index and adding it to that index. Understand?
ghostdog74, algorithmically your suggestion is correct, but holding the whole file in an array, as I wrote above, is not a good idea.

Furthermore, explicit numeric indexes are rarely needed in Perl programs, and more often than not a solution without them exists; using numeric indexes where they can be avoided is not considered good style in Perl.

That's why I'm pushing the OP towards a queue/FIFO, which for starters can be implemented as a Perl array holding a small number of lines, using, say, the 'push' and 'shift' operations.
 
Old 04-30-2009, 04:55 AM   #8
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,695
Blog Entries: 5

Rep: Reputation: 241
Quote:
Originally Posted by Sergei Steshenko
ghostdog74, algorithmically your suggestion is correct, but holding the whole file in an array, as I wrote above, is not a good idea.
I agree with you, but I don't think he understands how to do it. I am so tempted to show how, but in order not to spoil your good will, I will just guide him on how to get the next few lines...

@OP, you can use a counter to get the next lines after your search pattern:
Code:
while(<>){
 if(/search/){
  $count = 4; # print the matching line plus the next 3 lines
 }
 print if $count-- > 0;
}
You can incorporate this into your code, together with the logic to get the previous lines. Good luck.
 
Old 04-30-2009, 12:13 PM   #9
hgate73
LQ Newbie
 
Registered: Sep 2008
Posts: 17

Original Poster
Rep: Reputation: 0
Thanks for the replies ghostdog74 and Sergei.

As for Sergei's comment about the array, can I just call the file handle directly? I.E.

Code:
	open(FILE, $file);	
	print "\nSearch Results\n";

	foreach $line (FILE){
		if ($line =~ /$search/){
			print "Your search returned: $line";
		}
	close (NUMBERS);
	}
}

Secondly,

Although I know conceptually what push and shift do (move data off a stack?) I've never used them. I don't understand what the code in the last post (by ghostdog74) is doing either - print if $count --> 0? What is it printing? You'll have to forgive my ignorance - I've barely used Perl at all, and most of my scripting has been basic things.

The "window" I refer to means "find the matching line, then print the nth line above and below the matching line as well."

I.E. if our match was found on line 10, and the window size was 1, the script would print lines 9, 10 and 11.

Last edited by hgate73; 04-30-2009 at 12:14 PM.
 
Old 04-30-2009, 01:30 PM   #10
Telemachos
Member
 
Registered: May 2007
Distribution: Debian
Posts: 754

Rep: Reputation: 59
Quote:
Originally Posted by hgate73
As for Sergei's comment about the array, can I just call the file handle directly? I.E.

Code:
	open(FILE, $file);	
	print "\nSearch Results\n";

	foreach $line (FILE){
No, you can't work on a filehandle that way. And you don't want to use foreach anyway: foreach (<FILE>) reads every line into a list before it iterates, and you are trying to avoid holding the whole file in memory at once, in case the file is very large and would gobble up all your available memory.
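A while loop over the filehandle reads one line per iteration instead. A minimal sketch, assuming the search term is used as a regular expression as in the original script; the helper name print_matches and the returned hit count are illustrative additions, not part of the posted code:

```perl
use strict;
use warnings;

# Hypothetical helper: print every line of $file that matches $search,
# holding only one line in memory at a time. Returns the number of hits.
sub print_matches {
    my ($search, $file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my $hits = 0;
    while (my $line = <$fh>) {      # reads a single line per iteration
        if ($line =~ /$search/) {
            print "Your search returned: $line";
            $hits++;
        }
    }
    close($fh);
    return $hits;
}

# Usage: perl search.pl searchterm filename
print_matches(@ARGV) if @ARGV == 2;
```

The lexical filehandle ($fh) and the three-argument open are a matter of style; a bareword handle such as FILE would work with the same while loop.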

Quote:
Originally Posted by hgate73
Although I know conceptually what push and shift do (move data off a stack?) I've never used them.
You should read up on them. They are very basic functions in handling arrays in Perl. (See perldoc -f push and perldoc -f shift.)

Quote:
Originally Posted by hgate73
I don't understand what the code in the last post (by ghostdog74) is doing either - print if $count --> 0? What is it printing?
If you call print without an explicit item you want to print, then Perl defaults to printing whatever is currently in the special variable $_. Since that variable is also the default target when reading through a file with a bare filehandle read in a while condition (while (<FILE>) ...), you often see people code like this:
Code:
while (<$file_handle>) {
  s/foo/bar/; # same as $_ =~ s/foo/bar/: substitute in the current line
  print;      # same as print $_: print the current line of the file
}
This sort of thing is hard to read initially for some people, but it saves you a lot of typing. It's idiomatic Perl, so you will have to get used to seeing it (if you continue to use Perl). Ghostdog is trying to suggest a way to print your n items after the find. (His solution there doesn't cover the n lines before the find.)

Quote:
Originally Posted by hgate73
You'll have to forgive my ignorance - I've barely used Perl at all, and most of my scripting has been basic things.

The "window" I refer to means "find the matching line, then print the nth line above and below the matching line as well."

I.E. if our match was found on line 10, and the window size was 1, the script would print lines 9, 10 and 11.
This is a bit trickier than it may seem, and it seems like a bad idea to do it using a language you don't know at all. In a nutshell, you will need to (1) work through the file line by line, but (2) keep a running array of the last n lines - where n = your window, and then (3) when you find a hit, (4) print the last n lines from the saved array plus the current line, plus the next n lines. If I understand Sergei correctly, he is suggesting push and shift for step 2. You push (add) items onto the end of the array and shift (remove) them from the front.
Code:
my @array = qw/one two three four/;
push @array, 'five'; # @array now = 'one', 'two', 'three', 'four', 'five'
shift @array;        # @array now = 'two', 'three', 'four', 'five'
Since an entire string (ie, a line of a file) can easily be an item of a Perl array, you can store your last n lines using this technique.
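Putting steps 1 to 4 together, one possible shape for the whole thing is sketched below. It is not the only way to do it: the helper name window_search is hypothetical, the results are collected in an array only so that they are easy to inspect, and the demo reads from an in-memory string instead of a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: read lines from $fh and return each matching line
# together with up to $n lines before it and $n lines after it.
sub window_search {
    my ($fh, $search, $n) = @_;
    my @before;        # the last $n lines that have not been output yet
    my $after = 0;     # how many trailing lines are still owed
    my @out;
    while (my $line = <$fh>) {
        if ($line =~ /$search/) {
            push @out, @before, $line;   # saved context, then the hit
            @before = ();                # do not output these lines twice
            $after  = $n;
        }
        elsif ($after > 0) {
            push @out, $line;            # trailing context
            $after--;
        }
        else {
            push @before, $line;         # possible leading context
            shift @before if @before > $n;
        }
    }
    return @out;
}

# Demo on an in-memory "file": "line 1" .. "line 12", window size 1.
my $text = join '', map { "line $_\n" } 1 .. 12;
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
print window_search($fh, 'line 10', 1);
close($fh);
```

For the example in the question (match on line 10, window size 1) this prints lines 9, 10 and 11. Only the last $n unmatched lines are ever buffered, so the input file itself is never held in memory as a whole.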
 
Old 04-30-2009, 02:13 PM   #11
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453
In order to "feel" what 'push' and 'shift' do, temporarily forget about matching and implement the following: for a desired $n (say, 3), read lines from the file and, whenever possible, print the last $n lines read unconditionally, i.e. after reading line #3 print lines 1, 2, 3; after reading line #4 print lines 2, 3, 4; and so forth.

When you know how to do this, the solution to your complete problem will be obvious.
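For reference, one way the exercise could look is sketched here. The helper name sliding_windows is hypothetical, and each window is returned as a single string purely to keep the demo short.

```perl
use strict;
use warnings;

# Hypothetical helper: after each line read from $fh, once $n lines are
# available, record the last $n lines read as one string.
sub sliding_windows {
    my ($fh, $n) = @_;
    my (@queue, @windows);
    while (my $line = <$fh>) {
        push @queue, $line;               # newest line goes on the end
        shift @queue if @queue > $n;      # oldest line falls off the front
        push @windows, join('', @queue) if @queue == $n;
    }
    return @windows;
}

# Demo on an in-memory "file" holding the lines 1 to 5.
my $text = join '', map { "$_\n" } 1 .. 5;
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
print "---\n$_" for sliding_windows($fh, 3);
close($fh);
```

With $n set to 3, the demo produces the windows 1 2 3, then 2 3 4, then 3 4 5, each preceded by a separator line.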
 