LinuxQuestions.org
Old 05-23-2012, 01:34 AM   #1
acc_Wk
LQ Newbie
 
Registered: Jul 2011
Posts: 18

Rep: Reputation: Disabled
Awk inside Awk?


Hi,
I'm using awk inside awk (via command substitution) and I want to apply the FS of the outer awk to the inner awk.

My main goal is actually to insert a line into a table so that it conforms to the format of that table.
E.g. I have two files (multiple spaces were getting truncated when I posted, so I have written each space as <space>):
---------------------------------------------------------
cat f1
hdr1<space>hdr2<space>hdr3<space>hdr4

abcd<space>bcef<space>pqrs<space>xyza
---------------------------------------------------------
cat f2
hdr1<space><space>hdr2<space><space>hdr3<space><space>hdr4

1111<space><space>2222<space><space>3333<space><space>4444

My desired output is :
----------------------------------------------------------
hdr1<space><space>hdr2<space><space>hdr3<space><space>hdr4

abcd<space><space>bcef<space><space>pqrs<space><space>xyza
1111<space><space>2222<space><space>3333<space><space>4444
----------------------------------------------------------

(Note that although the headers are the same in both files, the spacing differs, so I want the line from f1 inserted into f2 using f2's format.)

For this I was planning on using the following awk code :

cat f2 | awk -v "n=2" -v "s=` cat f1| sed 1d | awk 'BEGIN { FS = "<outer awk's FS>" }; {printf "%s %s %s %s %s %s %s\n", $1,$2,$3,$4,$5,$6,$7}'`" '(NR==n) { print s } 1'

Is this actually possible? Or is there a better approach to achieve the desired output? Any help/hints would be much appreciated!!

Thanks!!
 
Old 05-23-2012, 02:14 AM   #2
jhwilliams
Senior Member
 
Registered: Apr 2007
Location: Portland, OR
Distribution: Debian, Android, LFS
Posts: 1,168

Rep: Reputation: 211
The following AWK script would accomplish the output you desire:

Code:
{
    gsub(/[ \t]+/,"  ",$0)
}

/^hdr/ {
    if (0 == seen) {
        print $0"\n"
    }

    seen++
    next
}

!/^$/ {
    print $0
}
Save that as parse.awk, and then invoke like this:

Code:
awk -f parse.awk f1 f2
 
Old 05-23-2012, 02:33 AM   #3
acc_Wk
LQ Newbie
 
Registered: Jul 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
Hi,

Sorry, I wasn't very clear about this. The second file (f2) doesn't have uniform spacing (like 2 spaces between the columns); it can have a more irregular format like:

hdr1<space><space>hdr2<space><space><space><space>hdr3<space>hdr4
aaaa<space><space>bbbb<space><space><space><space>cccc<space>dddd

So basically the spacing is unpredictable. All I'm sure of is that both files have the same headers, and I need to merge the (only) data line from one file into the other file, which has many lines (with the same column headers).

Thanks for your time though!!
 
Old 05-23-2012, 02:52 AM   #4
jhwilliams
Senior Member
 
Registered: Apr 2007
Location: Portland, OR
Distribution: Debian, Android, LFS
Posts: 1,168

Rep: Reputation: 211
Hi acc, that shouldn't matter. Did you try it? The first action of the script I provided is to normalize spacing to two spaces, everywhere.
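
For illustration, here is that normalization step on its own. This is only a sketch: the function name normalize_spacing and the sample line are made up and are not part of parse.awk.

```shell
# Hypothetical helper: collapse every run of spaces/tabs to exactly two spaces.
normalize_spacing() {
    awk '{ gsub(/[ \t]+/, "  "); print }' "$@"
}

# Made-up sample line with irregular spacing.
printf 'hdr1  hdr2    hdr3 hdr4\n' | normalize_spacing
```

Every run of blanks becomes exactly two spaces, so the spacing is uniform but the original column positions are not preserved.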
 
Old 05-23-2012, 03:46 AM   #5
acc_Wk
LQ Newbie
 
Registered: Jul 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
Hi jhwilliams, yes, I did try. But I also realize now that I'm not giving the best sample data.

The values under the headers can extend longer than the headers.

I tried attaching a file but it didn't work, so please see if the example below is easy to understand ( _ == space):

Code:
f1 :
Header1_Header2_________Header3______Header4_Header5_Header6_____Header7

AAAAA___QQQQQQQ_________FFFFFF_______SSSSSS__-_______-___________-


f2:
Header1_____________________________Header2_________Header3______Header4_Header5_Header6_____Header7

AAAAAAAAAAAAAAAAAAAAAAAAAAA_________BBBBBBBB________CCCCCCC______DDDDDD__XXXXXX__-___________-
PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP_____EEEEEEEE________FFFFFFF______GGGGGG__YYYYYY__-___________-

desired:

Header1_____________________________Header2_________Header3______Header4_Header5_Header6_____Header7

AAAAA_______________________________QQQQQQQ_________FFFFFF_______SSSSSS__-_______-___________-
AAAAAAAAAAAAAAAAAAAAAAAAAAA_________BBBBBBBB________CCCCCCC______DDDDDD__XXXXXX__-___________-
PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP_____EEEEEEEE________FFFFFFF______GGGGGG__YYYYYY__-___________-
OK, I'm really finding it hard to convey the intended format. In edit mode it looks OK, but once posted the format changes. Anyway, I hope I've given a fair idea of what I'm looking for. In the above sample data, the only thing to keep in mind is that each column value starts where its corresponding column header starts; apart from that, the sample data is fine.

Appreciate your help!!

Thanks!

Last edited by colucix; 05-24-2012 at 06:27 PM. Reason: Added CODE tags as per request in post #14
 
Old 05-23-2012, 05:30 AM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 20,827

Rep: Reputation: 4006
Use [code] tags as above - from "advanced" it's the "#" token.
 
Old 05-23-2012, 06:46 AM   #7
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,976

Rep: Reputation: 3181
Maybe something like:
Code:
awk 'FNR==NR || (FNR > 1 && NF){$1=$1;print}' OFS="  " f1 f2
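
In case the one-liner looks cryptic, here is a commented sketch of the same command run against two tiny scratch files. The function name merge_two_spaces and the sample data are made up for illustration.

```shell
# Scratch directory with two tiny sample files (made-up data).
tmp=$(mktemp -d)
printf 'hdr1 hdr2\n\nabcd bcef\n' > "$tmp/f1"
printf 'hdr1   hdr2\n\n1111   2222\n' > "$tmp/f2"

merge_two_spaces() {
    # FNR==NR        : true only while reading the first file, so all its lines are kept
    # FNR > 1 && NF  : in later files, skip the header (line 1) and empty lines
    # $1 = $1        : forces awk to rebuild $0, joining the fields with OFS
    awk 'FNR==NR || (FNR > 1 && NF) { $1 = $1; print }' OFS='  ' "$@"
}

merge_two_spaces "$tmp/f1" "$tmp/f2"
```

The header and blank line come from the first file only, and every printed line ends up with exactly two spaces between fields.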
 
Old 05-23-2012, 06:57 AM   #8
acc_Wk
LQ Newbie
 
Registered: Jul 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
Hi,

Thanks syg00

So finally, here is the format :
Code:
f1
Header1 Header2         Header3      Header4 Header5 Header6     Header7

AAAAA   QQQQQQQ         FFFFFF       SSSSSS  -       -           -

___________________________________________________________________________________________________
f2
Header1                             Header2         Header3      Header4 Header5 Header6     Header7

AAAAAAAAAAAAAAAAAAAAAAAAAAA         BBBBBBBB        CCCCCCC      DDDDDD  XXXXXX  -           -
PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP     EEEEEEEE        FFFFFFF      GGGGGG  YYYYYY  -           -

___________________________________________________________________________________________________
Desired :

Header1                             Header2         Header3      Header4 Header5 Header6     Header7

AAAAA                               QQQQQQQ         FFFFFF       SSSSSS  -       -           -
AAAAAAAAAAAAAAAAAAAAAAAAAAA         BBBBBBBB        CCCCCCC      DDDDDD  XXXXXX  -           -
PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP     EEEEEEEE        FFFFFFF      GGGGGG  YYYYYY  -           -
@grail: It didn't solve the formatting problem for me when I gave the two files as inputs to the awk line you suggested:

Code:
# 
awk 'FNR==NR || (FNR > 1 && NF){$1=$1;print}' OFS="  " t1 t2
Header1  Header2  Header3  Header4  Header5  Header6  Header7

AAAAA  QQQQQQQ  FFFFFF  SSSSSS  -  -  -
AAAAAAAAAAAAAAAAAAAAAAAAAAA  BBBBBBBB  CCCCCCC  DDDDDD  XXXXXX  -  -
PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP  EEEEEEEE  FFFFFFF  GGGGGG  YYYYYY  -  -
 
Old 05-23-2012, 09:47 AM   #10
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
Quote:
Originally Posted by acc_Wk View Post
In the above sample data, the only thing to keep in mind is that each column value starts where its corresponding column header starts; apart from that, the sample data is fine.
In your words there is the solution. Just check where the headers start and set the proper format for each field. This assumes that all the headers are different from each other, otherwise the index function fails because it returns the starting position of the leftmost occurrence.
Code:
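awk 'BEGIN {

  # Read the header line of the second file (f2) into $0
  getline < ARGV[2]
  
  # Starting position of each header within that line
  for ( i = 1; i <= NF; i++ )
    n[i] = index($0, $i)
    
  # Width of each column = distance between consecutive header starts
  for ( i = 2; i <= NF; i++ )
    f[i-1] = ( "%-" n[i]-n[i-1] "s" )
  
  # The last column needs no padding
  f[i-1] = "%-s"
  
  print
  print ""
  
}

# Skip the header and the blank line of each file, and any other empty line
FNR > 2 && NF {

  for ( i = 1; i <= NF; i++ )
    printf f[i], $i
  
  print ""
  
}' f1 f2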
Notice that the arguments must be exactly in this order, f1 then f2. The BEGIN section reads the header of the second file to retrieve the format (and prints it followed by an empty line, as in your example). Then the content of f1 is printed out with the format retrieved from f2. Finally the content of f2 is appended. Hope this helps.
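
If two headers might share a name, one possible workaround is to find the column starts by walking the header with match() instead of calling index(). The following is only a sketch, checked on small samples: the function name merge_aligned and the sample data are hypothetical, and it assumes the first header starts in column 1 and that no value is wider than its column.

```shell
# Hypothetical helper: merge_aligned ROWS_FILE LAYOUT_FILE
merge_aligned() {
    awk '
    BEGIN {
        # Read the header of the layout file (second argument).
        if ((getline header < ARGV[2]) <= 0) exit 1
        close(ARGV[2])

        # Walk the header left to right, recording where each name starts,
        # so repeated names are handled correctly.
        rest = header; pos = 0; nf = 0
        while (match(rest, /[^ \t]+/)) {
            start[++nf] = pos + RSTART
            pos += RSTART + RLENGTH - 1
            rest = substr(rest, RSTART + RLENGTH)
        }

        # Column width = distance between consecutive starts; last one unpadded.
        for (i = 1; i < nf; i++)
            fmt[i] = "%-" (start[i + 1] - start[i]) "s"
        fmt[nf] = "%s"

        print header
        print ""
    }
    # Skip header and blank line of each file, plus any other empty line.
    FNR > 2 && NF {
        line = ""
        for (i = 1; i <= NF; i++)
            line = line sprintf((i in fmt) ? fmt[i] : " %s", $i)
        print line
    }' "$1" "$2"
}

# Made-up sample data with a repeated header name.
tmp=$(mktemp -d)
printf 'Name Name Id\n\nab cd 1\n' > "$tmp/f1"
printf '%-10s%-6s%s\n\n%-10s%-6s%s\n' Name Name Id xxxxxxx yy 22 > "$tmp/f2"

merge_aligned "$tmp/f1" "$tmp/f2"
```

As with the script above, the argument order matters: the file with the row to insert comes first, the file whose layout should win comes second.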
 
Old 05-23-2012, 12:54 PM   #11
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,976

Rep: Reputation: 3181
Quote:
It didnt solve the formatting problem for me , when I gave the two files as inputs to the awk line you suggested...
Actually ... it provided exactly the output you initially requested, i.e. all data from both files with only a single header and 2 spaces between each data item.

Your new requirement to have the data formatted in columns is best suited to the column command:
Code:
awk 'FNR==NR || (FNR > 1 && NF){$1=$1;print}' f1 f2 | column -t
If you still want the blank line you can play with awk some more.
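
One way to "play with awk some more", as a rough sketch: drop the empty lines before column sees them and put a single blank line back after the header. The function name merge_with_column and the sample files are made up. Note that column -t takes its widths from the widest value in each column, not from f2's header positions.

```shell
# Hypothetical helper wrapping the pipeline above.
merge_with_column() {
    # NF drops empty lines up front; column -t aligns the rest;
    # the last awk puts one blank line back after the header.
    awk 'NF && (FNR==NR || FNR > 1)' "$@" |
        column -t |
        awk '{ print } NR == 1 { print "" }'
}

# Made-up sample data.
tmp=$(mktemp -d)
printf 'Header1 Header2\n\nab cd\n' > "$tmp/f1"
printf 'Header1      Header2\n\nxxxxxxxxxx   yy\n' > "$tmp/f2"

merge_with_column "$tmp/f1" "$tmp/f2"
```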
 
Old 05-23-2012, 06:41 PM   #12
jhwilliams
Senior Member
 
Registered: Apr 2007
Location: Portland, OR
Distribution: Debian, Android, LFS
Posts: 1,168

Rep: Reputation: 211
Here's a tweak to my original post, to meet your additional "pretty print" requirement:

Code:
column -t f1 f2 | awk '
    /^Header/ {
        if (!seen) {
            print $0"\n"
        }
    
        seen++
        next
    } 

    !/^$/'
 
Old 05-24-2012, 02:21 AM   #13
acc_Wk
LQ Newbie
 
Registered: Jul 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
colucix, grail, jhwilliams... Thanks so much!! I was struggling for a long time to get one solution and now I have 3 ways to do it. You guys are awKsome! Thanks!
 
Old 05-24-2012, 06:15 PM   #14
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Could you please go back and edit your previous posts to include code tags as well? The long lines are making my screen side-scroll. Thanks.
 
  

