LinuxQuestions.org
Old 10-17-2011, 10:52 PM   #1
casperdaghost
Member
 
Registered: Aug 2009
Posts: 349

Rep: Reputation: 16
internal awk variable Field separator


I have a file, myfile, which contains five words
delimited by both colons and spaces:

One Two:Three:4 Five

If I cat this file and pipe it to this awk script:

cat myfile | ./myawkprogram

I get 'Five' when I expected to get the third field, '4 Five'.

Here is the code.


[CODE]
#!/bin/awk -f
{
    if ( $0 ~ /:/ ) {
        FS=":";
    } else {
        FS=" ";
    }

    print $3
}
[/CODE]
 
Old 10-18-2011, 12:30 AM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Have a think about what you are asking here and where you are in the code when you ask it.

By the time your awk script has something in $0, it has already split the record into fields, and since you do not set FS before that point, it uses the default (whitespace). Changing FS inside the rule only affects the records read after it.
Also, awk can read a file itself, so cat is not required.
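
To make the timing concrete, here is a minimal sketch. The two-line input is made up for illustration (the original file has one line), and it assumes a POSIX-conforming awk such as gawk. The first record is still split on whitespace; only the second one sees the colon separator.

```shell
# Hypothetical input: the sample line repeated twice.
out=$(printf 'One Two:Three:4 Five\nOne Two:Three:4 Five\n' |
  awk '{ if ($0 ~ /:/) FS = ":"; print NR ": " $3 }')
# Show the third field of each record, prefixed by the record number.
printf '%s\n' "$out"
```

With the original script, the same file can also be passed directly as an argument, for example `awk -f myawkprogram myfile`, instead of going through cat.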
 
Old 10-22-2011, 03:05 PM   #3
casperdaghost
Member
 
Registered: Aug 2009
Posts: 349

Original Poster
Rep: Reputation: 16
Yeah, when I add a BEGIN block and set the field separator to space I get "Five" as a result; when I set the field separator to colon I get "4 Five".

I am not crystal clear on this, but I think the field separator is best set before the line iteration starts, not during it. I can do all of this with bash; I am just on an awk kick and want to explore the language. Thanks.


This is the file: One Two:Three:4 Five

Code:
#!/usr/bin/gawk -f
BEGIN {
        FS=" ";
}
{
        if ( $0 ~ /:/ ) {
                FS=":";
        } else {
                FS=" ";
        }

        print $3
}
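
For comparison, a minimal sketch of setting the separator once, before any input is read. It assumes the single-line sample above and that the colon-delimited third field is the one wanted; the variable names are only for illustration. `-F:` on the command line should behave the same as the BEGIN assignment.

```shell
line='One Two:Three:4 Five'
# FS set in BEGIN, so it is in effect before the first record is read.
with_begin=$(printf '%s\n' "$line" | awk 'BEGIN { FS = ":" } { print $3 }')
# Same separator given on the command line.
with_flag=$(printf '%s\n' "$line" | awk -F: '{ print $3 }')
# Default whitespace splitting, for contrast.
with_default=$(printf '%s\n' "$line" | awk '{ print $3 }')
printf '%s\n' "$with_begin" "$with_flag" "$with_default"
```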
 
Old 10-22-2011, 04:30 PM   #4
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,187

Rep: Reputation: 354
FS can be a regular expression: FS=/[ :]/ or FS=/( +)|:/ might be what you want.

<edit>
Or, more generally, FS=/([[:space:]]+)|:/ or FS=/[[:space:]]*[:[:space:]][[:space:]]*/

That last one says "zero or more white-space characters followed by either a white-space character or a colon, followed by zero or more white-space characters"
</edit>

Last edited by PTrenholme; 10-22-2011 at 04:41 PM.
 
Old 10-22-2011, 10:52 PM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Hmmm ... not quite sure where you were going with this one, PT? Neither of your edited versions seems to return any output for the third field.

Ahh ... just did a little test. The issue is that whilst FS is a computed regex, it requires quotes (""), although they are turned into slashes (//) at some point.
Anyway, even with quotes you are not generating the desired output.
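
A small sketch of why the unquoted form prints nothing, as far as I can tell: a bare /regex/ in an expression is shorthand for `$0 ~ /regex/`, so `FS = /[ :]/` stores the match result (0 or 1) in FS rather than the pattern. The simpler `[ :]` pattern is used here instead of the `[[:space:]]` versions, and the variable names are only for illustration.

```shell
line='One Two:Three:4 Five'
# Unquoted: FS receives the result of matching $0, not the regex itself,
# so the line is never split and $3 is empty.
unquoted=$(printf '%s\n' "$line" | awk 'BEGIN { FS = /[ :]/ } { print NF "|" $3 }')
# Quoted: FS is the regex [ :], so spaces and colons both separate fields.
quoted=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[ :]" } { print NF "|" $3 }')
printf '%s\n' "$unquoted" "$quoted"
```

Even in the quoted case $3 is Three, not the "4 Five" the original post expected.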

@OP - I believe the best solution is to use split when you encounter a colon in the line and FS the rest of the time.
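
A minimal sketch of that idea, assuming the third colon-delimited piece is wanted on lines that contain a colon and the third whitespace field otherwise. The second input line and the array name `parts` are made up for illustration.

```shell
out=$(printf 'One Two:Three:4 Five\nalpha beta gamma delta\n' |
  awk '{
    if ($0 ~ /:/) {
      # Line contains a colon: split the whole record on ":" into parts[].
      split($0, parts, ":")
      print parts[3]
    } else {
      # No colon: use the default whitespace fields as they are.
      print $3
    }
  }')
printf '%s\n' "$out"
```

Because split() works on the current record directly, FS never has to change between lines.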
 
Old 10-23-2011, 11:12 AM   #6
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,187

Rep: Reputation: 354
Yes, of course, computed regular expressions should be in quoted strings:
Code:
$ echo "One Two:Three:4 Five" | gawk 'BEGIN {FS="[[:space:]]*[:[:space:]][[:space:]]*"} {print;for(i=1;i<=NF;++i) print "  $" i " = " $i}'
One Two:Three:4 Five
  $1 = One
  $2 = Two
  $3 = Three
  $4 = 4
  $5 = Five
<edit>
To show the reason for the "zero or more" stuff, consider this:
Code:
$ echo "One Two : Three:   4 Five" | gawk 'BEGIN {FS="[[:space:]]*" "[:[:space:]]" "[[:space:]]*"} {print;for(i=1;i<=NF;++i) print "  $" i " = \"" $i "\""}'
One Two : Three:   4 Five
  $1 = "One"
  $2 = "Two"
  $3 = "Three"
  $4 = "4"
  $5 = "Five"
</edit>

Last edited by PTrenholme; 10-23-2011 at 11:21 AM.
 
  


All times are GMT -5.