
CyberJedi 10-02-2005 11:14 AM

Perl regular expressions
I'm somewhat new to Perl and have just started getting into its powerful regular expressions. The tutorial I'm using has no solutions to check my work against, and I came across an exercise that asks for a Perl script that reads a file, numbers each matching line, and puts any double letters it finds in words inside parentheses.

Here's my script. It's supposed to match double letters (and number each line where it finds them), but this is all I came up with. It only matches one specific pair of letters at a time; I can't get it to match arbitrary double letters in words.

$" = "0";
$file = '/etc/electricity.txt'; # Name the file
open(INFO, $file); # Open the file
@lines = <INFO>; # Read it into an array
foreach $line (@lines)
{
    $_ = $line;
    if (s/(pp)/(\1)/g)
    {
        print $"."$_\n";
    }
}

The output should look like this......

01 I was ru(nn)ing in the field and tri(pp)ed over a rock. Thereby, hi(tt)ing my head on the ground.

I used an arbitrary text file as the input to extract the double letters from.

Can anyone help? I can't figure it out.

Thanks in advance.

puffinman 10-02-2005 11:54 AM

I'm not sure whether you wanted to number each instance or just put the line number on each line that contains a double. I did the latter. Here is a very short script that does it. I urge you to learn certain idioms, like the while (<>) {} loop and the fact that a for loop without an explicit variable defaults to $_ (so you can just write for (@lines) and each line will automatically be in $_). I also used the special variable $., which is the current line number (strictly, the current input record number, which is normally a line). Pass the name of the file to this script on the command line.

Learn the features of perl that make it beat the crap out of C for text processing!! Cheers!
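As a rough illustration of the idioms just mentioned (a sketch only, separate from the script that follows), the two helpers below show a for loop with no loop variable and the $. counter. The sub names and the in-memory filehandle are assumptions made to keep the example self-contained; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: a for loop with no loop variable puts each
# element in $_, and chomp with no argument also works on $_.
sub chomp_all {
    my @lines = @_;
    my @seen;
    for (@lines) {
        chomp;
        push @seen, $_;
    }
    return @seen;
}

# Hypothetical helper: prefix every line with $., the current input
# record number. An in-memory filehandle stands in for a real file.
sub number_lines {
    my ($text) = @_;
    my $out = '';
    open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";
    while (<$in>) {
        $out .= "$. $_";
    }
    close($in);
    return $out;
}

print join('|', chomp_all("one\n", "two\n")), "\n";
print number_lines("alpha\nbeta\n");
```

With a real file, while (<>) reads from whatever file names are given on the command line, so no explicit open is needed.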



while (<>) {
  s/((.)\2)/($1)/g ? print "$. $_" : print;
}
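For anyone reading along, here is how the substitution works: the inner group captures a single character, \2 requires that same character to appear again immediately, and the outer group holds the whole pair so it can be wrapped in parentheses. Because . matches any character, doubled spaces or digits would be marked too. The sketch below is one possible variant, not the script above: it assumes only ASCII letters should count, and it zero-pads the number with printf to mimic the "01" in the expected output. The sub name mark_doubles and the sample line are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap doubled letters in parentheses and report
# whether the line was changed.
sub mark_doubles {
    my ($line) = @_;
    # Group 2 captures one letter, \2 demands the same letter again,
    # and group 1 holds the whole pair for the replacement.
    my $count = ($line =~ s/(([A-Za-z])\2)/($1)/g);
    return ($line, $count ? 1 : 0);
}

my $sample = "I was running in the field and tripped over a rock.\n";
my ($marked, $changed) = mark_doubles($sample);

# Only lines that contain a double get a zero-padded number here.
printf "%02d %s", 1, $marked if $changed;
```

In a real script the literal 1 would be replaced by $. inside a while (<>) loop.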

CyberJedi 10-02-2005 12:17 PM


I understand your use of the regular expression in your script to find double-letters.

Thanks for the help and advice.
