LinuxQuestions.org
Old 10-17-2011, 11:52 PM   #1
casperdaghost
Member
 
Registered: Aug 2009
Posts: 349

Rep: Reputation: 16
internal awk variable Field separator


I have a file, myfile, which contains five words
delimited by both colons and spaces.

One Two:Three:4 Five

If I cat this file and pipe it to this awk script:

cat myfile | ./myawkprogram

I get 'Five' when I expected to get the third field, '4 Five'.

Here is the code.


[CODE]
#!/bin/awk -f
{
    if ( $0 ~ /:/ ) {
        FS=":";
    } else {
        FS=" ";
    }

    print $3
}
[/CODE]
 
Old 10-18-2011, 01:30 AM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,256

Rep: Reputation: 2686
Have a think about what you are asking here and at what point in the code you are asking it.

By the time your awk script has something in $0, it has already split the record into fields, and since you did not set FS before that point, it used the default separator.
Also, awk can read a file directly, so cat is not required.
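
A minimal sketch of the timing point, assuming a POSIX-conforming awk (gawk or mawk, for instance). The sample line is fed in twice so that the effect of assigning FS inside the main rule shows up on the second record; the `NR` prefix and the `out` variable are only there for illustration.

```shell
# Assigning FS inside the main rule does not re-split the current record;
# the new separator is used from the next record onward.
out=$(printf 'One Two:Three:4 Five\nOne Two:Three:4 Five\n' |
  awk '{ if ($0 ~ /:/) FS = ":"; print NR ": " $3 }')
printf '%s\n' "$out"
```

On a conforming awk, record 1 is split on the default whitespace separator and prints "1: Five", while record 2 is split on the colon and prints "2: 4 Five". Some older awk implementations defer field splitting and may behave differently.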
 
Old 10-22-2011, 04:05 PM   #3
casperdaghost
Member
 
Registered: Aug 2009
Posts: 349

Original Poster
Rep: Reputation: 16
Yeah, when I add a BEGIN block and set the field separator to a space I get "Five" as a result; when I set the field separator to a colon I get "4 Five".

I am not crystal clear on this, but I think the field separator is a parameter best set before the line iteration, not during it. I can do this all with bash; I am just on an awk kick and want to explore the language. Thanks.


This is the file: One Two:Three:4 Five

Code:
#!/usr/bin/gawk -f
BEGIN {
    FS=" ";
}
{
    if ( $0 ~ /:/ ) {
        FS=":";
    } else {
        FS=" ";
    }

    print $3
}
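
As a rough illustration of setting the separator before any record is read, the sketch below runs the sample line through three variants: FS set in BEGIN, the default FS, and the `-F` command-line option. The shell variable names are made up for this example.

```shell
# Choose the separator before any input is read, either in BEGIN or with -F.
line='One Two:Three:4 Five'
by_colon=$(printf '%s\n' "$line" | awk 'BEGIN { FS = ":" } { print $3 }')
by_space=$(printf '%s\n' "$line" | awk '{ print $3 }')   # default FS: runs of blanks
by_flag=$(printf '%s\n' "$line" | awk -F: '{ print $3 }')
printf '%s\n' "$by_colon" "$by_space" "$by_flag"
```

The colon variants give "4 Five" and the default separator gives "Five", which matches the results described in this post.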
 
Old 10-22-2011, 05:30 PM   #4
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,186

Rep: Reputation: 346
FS can be a regular expression: FS=/[ :]/ or FS=/( +)|:/ might be what you want.

<edit>
Or, more generally, FS=/([[:space:]]+)|:/ or FS=/[[:space:]]*[:[:space:]][[:space:]]*/

That last one says "zero or more white-space characters, followed by either a white-space character or a colon, followed by zero or more white-space characters".
</edit>

Last edited by PTrenholme; 10-22-2011 at 05:41 PM.
 
Old 10-22-2011, 11:52 PM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,256

Rep: Reputation: 2686
Hmmm ... not quite sure where you were going with this one, PT? Neither of your edited versions seems to return any output for the third field.

Ahh ... just did a little test. The issue is that, whilst FS is a computed regex, it requires quotes (""), although they are turned into slashes (//) at some point.
Anyway, even with quotes you are not generating the desired output.
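
A small sketch of why the quotes matter, assuming a POSIX-style awk. A bare `/.../` on the right of an assignment is evaluated as a match against $0, so FS ends up holding the match result instead of the pattern. The variable names `bare` and `quoted` are illustrative only.

```shell
line='One Two:Three:4 Five'
# Bare /.../ is shorthand for ($0 ~ /.../), so FS is assigned 1 (match) or 0 (no match).
bare=$(printf '%s\n' "$line" | awk '{ FS = /[ :]/; print FS }')
# A quoted string is stored as-is and then used as a regular expression for splitting.
quoted=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[ :]" } { print $3 }')
printf 'bare FS: %s\nquoted $3: %s\n' "$bare" "$quoted"
```

With the quoted form the third field is "Three", so the regex splits on both characters but still does not yield the "4 Five" the original poster wanted.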

@OP - I believe the best solution is to use split() when you encounter a colon in the line and FS the rest of the time.
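
One possible reading of that suggestion, as a hedged sketch: lines containing a colon are split explicitly with `split()`, and all other lines rely on the default FS. The second input line and the array name `part` are invented for the example.

```shell
# Per-line choice of separator without touching FS:
# split() on ":" when the line has a colon, default whitespace fields otherwise.
out=$(printf 'One Two:Three:4 Five\nalpha beta gamma\n' |
  awk '{
    if (index($0, ":")) { split($0, part, ":"); print part[3] }
    else                { print $3 }
  }')
printf '%s\n' "$out"
```

This prints "4 Five" for the colon-delimited line and "gamma" for the space-delimited one, because `split()` acts on the current line immediately, whereas an assignment to FS only applies to later records.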
 
Old 10-23-2011, 12:12 PM   #6
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,186

Rep: Reputation: 346
Yes, of course, computed regular expressions should be in quoted strings:
Code:
$ echo "One Two:Three:4 Five" | gawk 'BEGIN {FS="[[:space:]]*[:[:space:]][[:space:]]*"} {print;for(i=1;i<=NF;++i) print "  $" i " = " $i}'
One Two:Three:4 Five
  $1 = One
  $2 = Two
  $3 = Three
  $4 = 4
  $5 = Five
<edit>
To show the reason for the "zero or more" stuff, consider this:
Code:
$ echo "One Two : Three:   4 Five" | gawk 'BEGIN {FS="[[:space:]]*" "[:[:space:]]" "[[:space:]]*"} {print;for(i=1;i<=NF;++i) print "  $" i " = \"" $i "\""}'
One Two : Three:   4 Five
  $1 = "One"
  $2 = "Two"
  $3 = "Three"
  $4 = "4"
  $5 = "Five"
</edit>

Last edited by PTrenholme; 10-23-2011 at 12:21 PM.
 
  

