ryedunn 09-08-2006 03:51 PM

Basic Perl - removing white from split field.
I'm trying to take a pipe-delimited file and remove the whitespace from every field except the particular fields I specify. I would like the script to accept multiple files, but I haven't got that far yet. For some reason it's not working. If someone could please help, it would be most appreciated.


$SourceFile = $ARGV[0];
$FieldNum = $ARGV[1];

open(INFILE, $SourceFile) or die "Can't open source file: $SourceFile \n";
open(OutGood, "> TEST.good.txt") or die "Can't open output file \n";

while (<INFILE>) {
    @fields = split /\|/, $_;
    if ($FieldNum != $fields){

        print OutGood join '|', @fields;
    }
}


makyo 09-09-2006 07:35 AM

Hi, neighbor-to-the-south.

It might help if you posted samples: the input data, how you want it to look as transformed, and how your code is mangling it ... cheers, makyo

ghostdog74 09-09-2006 08:31 AM


Originally Posted by ryedunn
@fields = split /\|/, $_;
if ($FieldNum != $fields)

Without any further information, I suspect you want to check $FieldNum against one of the elements in @fields? To access the elements, you need to do something like $fields[0] or $fields[1], etc.
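A minimal sketch of that point, assuming the sample line posted later in the thread. In Perl a bare `$fields` is a separate scalar that has nothing to do with the array `@fields`, so `$FieldNum != $fields` compares against an unset value (and fails to compile under `use strict`). The helper name `field_at` is made up for illustration and is not from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return one field of a
# pipe-delimited line, counting fields from zero as Perl arrays do.
sub field_at {
    my ($line, $index) = @_;
    chomp $line;
    my @fields = split /\|/, $line;   # @fields is the whole array ...
    return $fields[$index];           # ... $fields[$index] is one element of it
}

my $line = "Joe Smith |01 01 1990 |New York|NY|212 555 1212\n";
print "first field : ", field_at($line, 0), "\n";
print "fourth field: ", field_at($line, 3), "\n";
```

Note that the index is zero-based, so a field number typed by a user as 1 would need to be looked up as `$fields[0]`.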

ryedunn 09-11-2006 09:43 AM

The input file would be something like:

First Name|Date of Birth| City | State | phone
Joe Smith |01 01 1990 |New York|NY|212 555 1212

I'm curious how I can remove all whitespace except in the fields specified on the command line ($ARGV[1]).

So if that's my input file, and I entered
$ 1,3
the output would be
Joe Smith |01011990|New York|NY|2125551212

This is just an example; the actual input file has many more fields, which is why I want to specify which fields NOT to strip whitespace from rather than which fields to strip.

Thank you for your help!

makyo 09-11-2006 04:34 PM

Hi, ryedunn.

I took your script and changed it a bit. I added some debugging prints so that you can enable them to see intermediate data transformations. To do that, just interchange the two $debug assignments in the code below, so that $debug = 1 comes last. I usually write to STDOUT so that I can redirect the output as I choose.

Keep in mind that this is a skeleton of one way to handle the data, and you might need to add to it to suit your situation.

Run the code, figure out what it is doing and if you have questions that are not answered by looking in a perl book, then feel free to ask ... cheers, makyo (73)



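#!/usr/bin/perl

# @(#) p1      Demonstrate perl features.

use warnings;
use strict;

# Debug switch: the last assignment wins, so swap these two lines to enable tracing.
my $debug;
$debug = 1;
$debug = 0;

# First argument: comma-separated list of zero-based field numbers to squeeze.
my $t1 = shift;
die "Cannot read fields.\n" unless defined $t1;
print "Fields read $t1:\n" if $debug;
my @squeezed = split /,/, $t1;
my @fields;

# Remaining arguments are input files (STDIN if none are given).
while ( <> ) {
        chomp;          # drop the newline so it is not glued to the last field
        print "input  :$_:\n" if $debug;
        @fields = split /[|]/;
        print "split ",$#fields+1," from line $.\n" if $debug;
        foreach my $i ( @squeezed ) {
                print "squeezing field $i :$fields[$i]: in line $.\n" if $debug;
                $fields[$i] =~ s/\s+//g;
        }
        print join("|",@fields),"\n";
}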

When run on your data line:

% ./p1 1,4 data1
Joe Smith |01011990|New York|NY|2125551212

( edit | correct omission )

ryedunn 09-12-2006 09:55 AM

Oh wow, that's great, and it's teaching me a lot more about Perl in general. This is pretty far over my head, but I do have Learning Perl and the Perl Cookbook from O'Reilly to help me sort out some of the commands I'm not familiar with.

Just one question: since there are more fields to remove whitespace from than not, how hard is it to reverse? (i.e. 1,4 will NOT remove whitespace from those fields?)

And how do I use the debug option?

Very very cool stuff.

makyo 09-12-2006 10:30 AM

Hi, ryedunn.

I'm glad you liked it.

Once you have written a number of programs in a language you will probably be able to think in that language, and, for example, your fingers will seem to know to type in a ";" at the end of a perl statement, etc. I believe this is similar to knowing a natural language well enough so that you dream in it. What I call "makyo's limit" for this number is around 100.

To modify the program so that it leaves the listed fields alone instead of squeezing them, I recommend that you think about a way to consider every field and, if the field is not to be squeezed, skip over it. That means you need to find out how many fields there are and write a loop that looks at every one. Chapters 3 and 10 in Learning Perl, 3rd Edition discuss control structures.
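One possible shape for that loop, as a rough sketch rather than the only way to do it. The helper name `squeeze_except` and the hash `%keep` are invented for illustration; the field numbers are assumed to be zero-based, as in the p1 script, so the 1,3 from the earlier example becomes 0,2 here.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): strip whitespace from every
# field except those whose zero-based index is listed in @$keep.
sub squeeze_except {
    my ($line, $keep) = @_;
    chomp $line;
    my %keep = map { $_ => 1 } @$keep;     # lookup table of protected indexes
    my @fields = split /[|]/, $line, -1;   # -1 keeps trailing empty fields
    for my $i ( 0 .. $#fields ) {          # consider every field ...
        next if $keep{$i};                 # ... and skip the protected ones
        $fields[$i] =~ s/\s+//g;
    }
    return join "|", @fields;
}

print squeeze_except("Joe Smith |01 01 1990 |New York|NY|212 555 1212\n", [0, 2]), "\n";
```

Using a hash for the protected indexes keeps the test inside the loop to a single lookup, however many fields the real file has.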

Get the job done, but don't be afraid to experiment by writing little scripts to make sure you understand some points of the language. Remember that this is one way to accomplish the specific task you are addressing, and that there are others ... cheers, makyo

ghostdog74 09-13-2006 01:24 AM

An alternative, in Python:


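import sys

# Field numbers on the command line are 1-based; convert them to 0-based indexes.
try:
    input_options = [int(i) - 1 for i in sys.argv[1].split(",")]
except (IndexError, ValueError):
    input_options = []  # no usable field list given: leave every field alone

with open("input.txt") as infile:
    for line in infile:
        getall = line.rstrip("\n").split("|")
        for num in input_options:
            try:
                getall[num] = getall[num].replace(" ", "")
            except IndexError:
                pass  # this line has fewer fields than requested
        print('|'.join(getall))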


c:\> python 1,2
JoeSmith|01011990|New York|NY|212 555 1212
StanLee|02021903|New Jersey|NJ|213 544 1214|NU sdffsadd
