LinuxQuestions.org
Old 05-10-2010, 04:19 PM   #1
sebelk
Member
 
Registered: Jan 2007
Posts: 66

Rep: Reputation: 15
awk: swapping fields and records and for loop


Hi,

I have a file (let's say usersfile) with a structure like the following:

jdoe
jdoe@myorg.com
jdoe@somemail.com

msmith
msmith@myorg.com

yworld
yworld@myorg.com
yw100@anothermail.biz
EOF

I want to get something like this:

jdoe jdoe@myorg.com jdoe@somemail.com
msmith msmith@myorg.com
yworld yworld@myorg.com yw100@anothermail.biz

I did this:

Code:
awk 'BEGIN {FS="\n";RS="\n\n";}  {  print $1"\t"$2"\t"$3} ' usersfile
But this is not general enough, because I don't know beforehand how many mail accounts a user has, so I've tried:

Code:
 awk 'BEGIN {FS="\n";RS="\n\n";}  { for (i = 1; i <= NF; i++)    print $i "\t"} ' usersfile
But I've got:

jdoe
jdoe@myorg.com
jdoe@somemail.com
msmith
msmith@myorg.com
yworld
yworld@myorg.com
yw100@anothermail.biz

Please could you help me to fix the script?

Thanks in advance!!
 
Old 05-10-2010, 04:32 PM   #2
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,508

Rep: Reputation: 1957
You can try printf to format the output as desired, e.g.
Code:
awk 'BEGIN { FS="\n"; RS="\n\n" } { printf "%s",$1; for (i = 2; i <= NF; i++) printf "\t%s",$i; printf "\n" }' usersfile
or even more simply force the rebuild of the entire record, so that OFS is used when printing:
Code:
awk 'BEGIN { FS="\n"; RS="\n\n" } $1=$1' usersfile
Note: in this case set OFS="\t" in the BEGIN section if you prefer TAB as output separator.
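For reference, a minimal sketch of the TAB variant mentioned in the note. The sample input is only modelled on the usersfile from the first post, and the sketch uses RS="" (paragraph mode) rather than RS="\n\n", on the assumption that it should also run on awk implementations that do not accept a multi-character RS:

```shell
# Sample input modelled on the thread's usersfile (contents are illustrative).
sample='jdoe
jdoe@myorg.com
jdoe@somemail.com

msmith
msmith@myorg.com'

# RS="" (paragraph mode) is an assumption of this sketch, used for portability.
# Assigning $1 to itself rebuilds $0, joining the fields with OFS (a TAB here).
result=$(printf '%s\n' "$sample" |
  awk 'BEGIN { FS = "\n"; RS = ""; OFS = "\t" } { $1 = $1; print }')

printf '%s\n' "$result"
```

Each user should come out on one line, with the user name and every address separated by a single TAB.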

Last edited by colucix; 05-10-2010 at 05:00 PM. Reason: Added simplified code
 
Old 05-10-2010, 06:28 PM   #3
sebelk
Member
 
Registered: Jan 2007
Posts: 66

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by colucix
You can try printf to format the output as desired, e.g.
Code:
awk 'BEGIN { FS="\n"; RS="\n\n" } { printf "%s",$1; for (i = 2; i <= NF; i++) printf "\t%s",$i; printf "\n" }' usersfile
or even more simply force the rebuild of the entire record, so that OFS is used when printing:
Code:
awk 'BEGIN { FS="\n"; RS="\n\n" } $1=$1' usersfile
Note: in this case set OFS="\t" in the BEGIN section if you prefer TAB as output separator.

Thanks, although (sorry, my neurons are almost dead at this hour) I don't understand why your last example works.

Also, I've tried

Code:
 awk 'BEGIN {OFS="\t";FS="\n";RS="\n\n";}  {   print $0;} ' e2
But it outputs

jdoe
jdoe@myorg.com
jdoe@somemail.com
msmith
msmith@myorg.com
yworld
yworld@myorg.com
yw100@anothermail.biz

I don't understand why.

I will be thankful for your explanation!
 
Old 05-10-2010, 07:14 PM   #4
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,508

Rep: Reputation: 1957
When you simply do
Code:
print $0
the entire record is printed exactly as it was read (awk does not rebuild it, so OFS is not used to delimit the output fields). By contrast, if you change the content of a field, the record is rebuilt from its individual fields and the print statement prints them separated by OFS.
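A small sketch of that difference. The three-line input and the "-" separator are made up for illustration, and RS="" is used here instead of RS="\n\n" only to keep the sketch portable:

```shell
# One record made of three lines; the values are purely illustrative.
input='a
b
c'

# No field is assigned, so $0 is printed exactly as read and OFS is ignored.
unchanged=$(printf '%s\n' "$input" |
  awk 'BEGIN { FS = "\n"; RS = ""; OFS = "-" } { print $0 }')

# Assigning to any field makes awk rebuild $0 from the fields, joined by OFS.
rebuilt=$(printf '%s\n' "$input" |
  awk 'BEGIN { FS = "\n"; RS = ""; OFS = "-" } { $2 = "B"; print $0 }')

printf 'unchanged:\n%s\nrebuilt:\n%s\n' "$unchanged" "$rebuilt"
```

The first command should still print three separate lines, while the second should print the fields on one line joined by "-".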

In my example the statement
Code:
$1=$1
just assigns to the first field its own value, so the content of the fields is not really changed. It is the assignment itself, not merely a reference to a field, that causes awk to rebuild the record from its fields, joined by OFS.

Indeed, I used $1=$1 as a pattern, not an action. If you consider the syntax of an awk rule:
Code:
pattern { action }
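$1=$1 is actually evaluated as an expression. Since it is an assignment and not a comparison (otherwise I would have used ==), its value is the value assigned to $1. Here that is a non-null, non-numeric string (the user name), which awk interprets as TRUE. As a consequence, since I did not specify any action, the default one (print $0) is used. Note that a record whose first field is empty or equal to 0 would make the pattern FALSE and would not be printed.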
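To illustrate how the pattern form depends on the value of the first field, here is a hedged sketch with a made-up record whose first field is 0 (an unlikely user name, chosen only to show the effect). RS="" is again an assumption of the sketch rather than the RS="\n\n" used above:

```shell
# Two records; the first one starts with the (hypothetical) user name "0".
input='0
zero@myorg.com

jdoe
jdoe@myorg.com'

# $1=$1 as a pattern: the record is printed only if the assigned value is true.
pattern_only=$(printf '%s\n' "$input" |
  awk 'BEGIN { FS = "\n"; RS = "" } $1 = $1')

# $1=$1 inside an action: the record is rebuilt and printed unconditionally.
in_action=$(printf '%s\n' "$input" |
  awk 'BEGIN { FS = "\n"; RS = "" } { $1 = $1; print }')

printf 'pattern only:\n%s\nin action:\n%s\n' "$pattern_only" "$in_action"
```

With the pattern form the record starting with 0 is expected to be dropped, so the explicit action is the safer choice when the first field can be empty or zero.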

Equivalent rules are:
Code:
$1 = $1 { print $0 }
where the action is specified, or - more explicitly
Code:
{ $1 = $1; print $0 }
where the $1=$1 is a real assignment. This behavior is documented in the GNU awk manual, 3.4:
Quote:
...there are times when it is convenient to force awk to rebuild the entire record, using the current value of the fields and OFS. To do this, use the seemingly innocuous assignment:

$1 = $1 # force record to be reconstituted
print $0 # or whatever else with $0

This forces awk to rebuild the record.
I hope it is a little more clear now, but please... consider that even my neurons are malfunctioning at this time..
 
Old 05-10-2010, 07:34 PM   #5
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 713
Code:
awk 'BEGIN { RS="\n\n"; FS="\n" } { gsub(/\n/, " "); print $0 }'
 
Old 05-10-2010, 07:41 PM   #6
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,563

Rep: Reputation: 1939
@colucix - great explanation

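The only alteration I would make to the code is to set RS to the empty string. This is awk's paragraph mode: records are separated by one or more blank lines, and it does not rely on a multi-character RS, which not every awk supports.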
Code:
awk 'BEGIN { FS="\n"; RS="" } $1=$1' usersfile
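A short sketch of what paragraph mode tolerates. The input is made up, with a leading blank line, several blank lines between the records and a trailing blank line, and NF is printed only to show how many fields each record ends up with:

```shell
# Illustrative input with extra blank lines before, between and after records.
result=$(printf '\njdoe\njdoe@myorg.com\n\n\n\nmsmith\nmsmith@myorg.com\n\n' |
  awk 'BEGIN { FS = "\n"; RS = "" } { $1 = $1; print NF ": " $0 }')

printf '%s\n' "$result"
```

Both records should report two fields, with no empty records or empty trailing fields caused by the extra blank lines. The blank lines must be truly empty for this to hold; a line containing only spaces does not separate records.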
 
  


All times are GMT -5.
