LinuxQuestions.org
09-08-2006, 02:51 PM   #1
ryedunn
Member
 
Registered: Jul 2003
Location: Chicago
Distribution: Fedora, ubuntu
Posts: 458

Rep: Reputation: 30
Basic Perl - removing white from split field.


I'm trying to take a pipe-delimited file and remove the whitespace from every field except the ones I specify. I would like the script to accept multiple files, but I haven't got that far yet. For some reason it's not working. If someone could please help, it would be most appreciated.

Code:
$SourceFile = $ARGV[0];
$FieldNum = $ARGV[1];

open(INFILE, $SourceFile) or die "Can't open source file: $SourceFile \n";
open(OutGood, "> TEST.good.txt") or die "Can't open output file \n";

while(<INFILE>)
{
    @fields = split /\|/, $_;
    
    if ($FieldNum != $fields){
    	s/\s+//g;
    	}

	print OutGood join '|', @fields;

}
 
09-09-2006, 06:35 AM   #2
makyo
Member
 
Registered: Aug 2006
Location: Saint Paul, MN, USA
Distribution: {Free,Open}BSD, CentOS, Debian, Fedora, Solaris, SuSE
Posts: 718

Rep: Reputation: 72
Hi, neighbor-to-the-south.

It might help if you posted samples: the input data, how you want it to look as transformed, and how your code is mangling it ... cheers, makyo
 
09-09-2006, 07:31 AM   #3
ghostdog74
Senior Member

Registered: Aug 2006
Posts: 2,695
Blog Entries: 5

Rep: Reputation: 241
Quote:
Originally Posted by ryedunn
...
@fields = split /\|/, $_;
if ($FieldNum != $fields)
.....
}
Without any further information, I suspect you want to check $FieldNum against one of the elements in @fields. In Perl, $fields is a separate scalar variable that is never assigned here; it is not the array. To access the elements, you need to do something like $fields[0] or $fields[1], etc.
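
A small sketch of the point above (not from the original post; the sample line and variable names are made up for illustration). It shows how array elements are accessed, and that a bare s/\s+//g edits $_ rather than the array that is printed afterwards:

```perl
#!/usr/bin/perl
use strict;      # with strict, an undeclared $fields is a compile-time error
use warnings;

# Hypothetical sample line, used only for illustration.
my $line = "Joe Smith |01 01 1990 |New York";

my @fields = split /\|/, $line;

# Individual elements are scalars: $ sigil plus an index counting from 0.
my $first = $fields[0];
my $count = scalar @fields;    # number of elements in the array

# A substitution without =~ works on $_, not on @fields.
$_ = $line;
s/\s+//g;

# To change one field, bind the substitution to that element.
$fields[1] =~ s/\s+//g;

print "first  :$first:\n";
print "count  :$count:\n";
print "\$_     :$_:\n";
print "joined :", join('|', @fields), ":\n";
```

In the original script the substitution changes $_ but the line is rebuilt from @fields, so the whitespace removal never reaches the output.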
 
09-11-2006, 08:43 AM   #4
ryedunn
Original Poster
sorry

the input file would be something like

First Name|Date of Birth| City | State | phone
Joe Smith |01 01 1990 |New York|NY|212 555 1212

I'm curious how I can remove all whitespace except in the fields specified on the command line ( ARGV[1] ).

So if that's my input file, and I entered
$ script.pl 1,3
the output would be
Joe Smith |01011990|New York|NY|2125551212

This is just an example; the actual input file has many more fields, which is why I want to specify which fields NOT to remove whitespace from rather than which fields to remove it from.

Thank you for your help!
 
09-11-2006, 03:34 PM   #5
makyo
Hi, ryedunn.

I took your script and changed it a bit. I added some debugging prints so that you can enable them to see intermediate data transformations. To do that, swap the two $debug assignment lines in the code below so that $debug = 1 comes last. I usually write to STDOUT so that I can redirect the output as I choose.

Keep in mind that this is a skeleton of one way to handle the data, and you might need to add to it to suit your situation.

Run the code, figure out what it is doing and if you have questions that are not answered by looking in a perl book, then feel free to ask ... cheers, makyo (73)

Code:
#!/usr/bin/perl

# @(#) p1       Demonstrate perl features.

use warnings;
use strict;
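# Debug switch: the last assignment wins, so swap the next two
# lines to turn the debugging prints on.
my $debug;
$debug = 1;
$debug = 0;

# First argument: comma-separated list of fields to squeeze.
# The numbers are used directly as array subscripts, so they count from 0.
my $t1 = shift;
die "Cannot read fields.\n" unless defined $t1;   # a plain "0" is valid
print "Fields read $t1:\n" if $debug;
my @squeezed = split /,/, $t1;
my @fields;

# Remaining arguments are input files; <> reads them (or STDIN) in turn.
while ( <> ) {
        chomp;
        print "input  :$_:\n" if $debug;
        @fields = split /[|]/;
        print "split ",$#fields+1," from line $.\n" if $debug;
        foreach my $i ( @squeezed ) {
                print "squeezing field $i :$fields[$i]: in line $.\n" if $debug;
                # Remove all whitespace from this one field only.
                $fields[$i] =~ s/\s+//g;
        }
        print join("|",@fields),"\n";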
}
When run on your data line:
Code:
% ./p1 1,4 data1
Joe Smith |01011990|New York|NY|2125551212
( edit | correct omission )

Last edited by makyo; 09-11-2006 at 04:45 PM.
 
09-12-2006, 08:55 AM   #6
ryedunn
Original Poster
Wow

Oh wow, that's great, and it's teaching me a lot more about Perl in general. This is pretty far over my head, but I do have Learning Perl and the Perl Cookbook by O'Reilly to help me sort out some of the commands I'm not familiar with.

Just one question: since there are more fields to remove whitespace from than not, how hard is it to reverse (i.e. 1,4 will NOT remove whitespace from those fields)?

And how do I use the debug option?

Very very cool stuff.

Last edited by ryedunn; 09-12-2006 at 09:53 AM.
 
09-12-2006, 09:30 AM   #7
makyo
Hi, ryedunn.

I'm glad you liked it.

Once you have written a number of programs in a language you will probably be able to think in that language, and, for example, your fingers will seem to know to type in a ";" at the end of a perl statement, etc. I believe this is similar to knowing a natural language well enough so that you dream in it. What I call "makyo's limit" for this number is around 100.

To change the program so that the listed fields are left alone rather than squeezed, I recommend that you think about a way to consider every field and, if the field is not to be squeezed, skip over it. That means you need to find out how many fields there are and have a loop that looks at every field. Chapters 3 and 10 in Learning Perl, 3rd Edition discuss control structures.

Get the job done, but don't be afraid to experiment by writing little scripts to make sure you understand some points of the language. Remember that this is one way to accomplish the specific task you are addressing, and that there are others ... cheers, makyo
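
For readers who want to compare notes after trying it themselves, here is one possible sketch of the loop described above. It is not makyo's code: the helper name squeeze_except is made up for illustration, and it assumes the same 0-based field numbers as the p1 script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove whitespace from every
# field of a pipe-delimited line except the fields whose 0-based indices
# are listed in @keep.
sub squeeze_except {
    my ($line, @keep) = @_;

    # Turn the list of indices into a hash for quick "is it listed?" checks.
    my %keep = map { $_ => 1 } @keep;

    # The -1 limit stops split from dropping trailing empty fields.
    my @fields = split /[|]/, $line, -1;

    # Visit every field; skip the ones that are to be left alone.
    for my $i (0 .. $#fields) {
        next if $keep{$i};
        $fields[$i] =~ s/\s+//g;
    }
    return join '|', @fields;
}

# Sample line from earlier in the thread; fields 0 and 2 are left alone.
my $sample = 'Joe Smith |01 01 1990 |New York|NY|212 555 1212';
print squeeze_except($sample, 0, 2), "\n";
```

In a full script the list of fields to keep could come from the first argument, as in p1, with the helper called once per chomped input line. Subtracting 1 from each number would give the 1-based numbering used in the original question.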
 
09-13-2006, 12:24 AM   #8
ghostdog74
An alternative, in Python:

Code:
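import sys

# Field numbers on the command line are 1-based; convert to 0-based indices.
input_options = [int(i) - 1 for i in sys.argv[1].split(",")]

with open("input.txt") as infile:
    for line in infile:
        fields = line.rstrip("\n").split("|")
        for num in input_options:
            try:
                # Remove the spaces from the requested field only.
                fields[num] = fields[num].replace(" ", "")
            except IndexError:
                pass  # this line has fewer fields than requested
        print("|".join(fields))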
Output:
Code:
c:\> python test.py 1,2
JoeSmith|01011990|New York|NY|212 555 1212
StanLee|02021903|New Jersey|NJ|213 544 1214|NU sdffsadd
 
  

