LinuxQuestions.org
Old 07-23-2014, 10:47 AM   #1
jonnybinthemix
Member
 
Registered: May 2014
Location: Bristol, United Kingdom
Distribution: RHEL 5 & 6
Posts: 132

Rep: Reputation: Disabled
Combining multiple AWK commands


Hey Guys,

I was wondering if there's a way to combine multiple AWK commands and, if there is, whether I need to change the format?

I'm currently achieving what I want by doing the following:

Code:
diff -b $RLS $LLS | awk '{print $2}' | awk '$1=$1' | awk '{print "BAT_"$0".pgp" }' | while read i; do
I've tried just putting all the commands after one AWK but it displays an error. Is there a special syntax? Or have I done it right this way?

Thanks
Jon
 
Old 07-23-2014, 10:49 AM   #2
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,028

Rep: Reputation: 845
without the full context it seems like this would be the same:
Code:
diff -b $RLS $LLS | awk '{print "BAT_" $2 ".pgp"}' | while read i; do
 
Old 07-23-2014, 11:00 AM   #3
jonnybinthemix
Original Poster
Ah nice, that works

Makes sense... $2 straight into print..

Thanks, appreciate your help

But, if I wanted to combine multiple commands next time, can I just stack them up? Comma separated or something?
 
Old 07-23-2014, 01:07 PM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,254

Rep: Reputation: 2686
I am not sure I followed the point of the original logic, specifically the need for the awk in the middle.
The first awk will only ever return a contiguous group of characters (or an empty line if the record has fewer than 2 fields), so the second awk, which would typically be used to remove any additional whitespace, would have no whitespace left to remove.
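
One thing worth checking here: plain diff output also contains position lines such as `1d0` and `---`, which have no second field. For those, `print $2` emits an empty line, and `awk '$1=$1'` then drops it because an empty result counts as false, so the middle awk may have been filtering blank lines rather than trimming whitespace. A minimal sketch, assuming diff's default output format and using made-up sample lines (not taken from the real lists); matching only the `<`/`>` lines keeps that filtering inside a single awk:

```shell
# Made-up sample of default-format diff output
sample_diff='1d0
< 123456.JPEG
3a3
> 999999.JPEG'

# Without a filter, the "1d0" and "3a3" lines turn into a bogus "BAT_.pgp"
unfiltered=$(printf '%s\n' "$sample_diff" | awk '{ print "BAT_" $2 ".pgp" }')

# With a pattern, only lines that start with "< " or "> " are processed
filtered=$(printf '%s\n' "$sample_diff" | awk '/^[<>] / { print "BAT_" $2 ".pgp" }')

printf 'unfiltered:\n%s\n\nfiltered:\n%s\n' "$unfiltered" "$filtered"
```

If only the names missing from the second list are wanted, `/^< /` on its own would select just the lines that come from the first file given to diff.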

As for grouping statements, it would really depend on just how much work each individual awk is doing and what the output is from each to the next.
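
On the general question of stacking: as far as I know there is no comma syntax for this. Inside one `{ ... }` action, statements are separated by `;` or newlines, and a single program can hold several `pattern { action }` rules that are tried in order for every input line. A small sketch with made-up input (the variable `name` is purely illustrative):

```shell
stacked=$(printf '< 123456.JPEG\n---\n> 999999.JPEG\n' | awk '
  NF < 2 { next }                            # rule 1: skip lines without a second field
  { name = $2; print "BAT_" name ".pgp" }    # rule 2: two statements separated by ";"
')
printf '%s\n' "$stacked"
```

Each extra awk in a pipeline usually becomes either another statement in the same action or another rule in the same program.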
 
Old 07-24-2014, 04:25 AM   #5
jonnybinthemix
Original Poster
Hi Grail,

Thanks for your message..

The logic behind the original commands was: use diff to check the difference between the two variables (each contains a list of filenames), then awk to give me just the file name (omitting the < or > at the beginning). I was then noticing random white space, so I added the second awk command to make sure there's no white space, and the third awk command to add BAT_ to the beginning of each filename and .pgp to the end.

The reason for the above is that I'm downloading encrypted image files from an SFTP server via a script, and want to download only new images. However, the existing images which have been downloaded have already been decrypted and had the 'BAT_' removed, and as they're decrypted the '.pgp' has also gone. So, to get two lists and do a comparison, I first must take a list of what is on the server, strip the BAT_ & .pgp off, compare the two lists, take the differences, add the BAT_ & .pgp to the filenames again and then tell a loop to download all the files.

I've added the first part of the code to help explain my meaning.. (It all works without a hitch, but I'm really happy to listen to other ways of doing it and if I'm not doing it the best way, I'd love to learn).

Code:
/usr/bin/expect <<! > $FTPLIST
        spawn sftp -o$PORT $USER@$HOST
        expect "password:"
        send "$PASS\r"
        expect "sftp>"
        send "cd output\r"
        expect "sftp>"
        send "ls -1 *.JPEG.pgp\r"
        send "bye\r"
        expect eof
!
grep 'BAT_' $FTPLIST | cut -c5- | sed 's/\(.*\)\..*/\1/' > $RLS

ls -1 > $LLS

diff -b $RLS $LLS | awk '{print "BAT_"$2".pgp" }' | while read i; do
The loop then goes on to download $i within another expect session.
 
Old 07-24-2014, 08:24 AM   #6
schneidz
Quote:
Originally Posted by jonnybinthemix
... (It all works without hitch, but I'm really happy to listen to other ways of doing it and if I'm not doing it the best way, I'd love to learn).

the obvious best way would be to use ssh with keys so passwords aren't necessary?
 
Old 07-24-2014, 08:46 AM   #7
jonnybinthemix
Original Poster
Yes that would be nice, unfortunately I don't have this as an option.
 
Old 07-24-2014, 09:04 AM   #8
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian i686 (solaris)
Posts: 8,122

Rep: Reputation: 2270
you can also simplify the grep|cut|sed chain.
Code:
(not tested, because there is no sample input)
awk '/BAT_/ { a=substr($0, 5); b=split(a, "."); print b[0] } ' $FTPLIST > $RLS

diff can handle stdin, so:
ls -1 | diff -b $RLS - | awk '{print "BAT_"$2".pgp" }' | while ...
should work too
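
To illustrate the stdin form: `-` as a file argument tells diff to read that side from standard input. A sketch with throwaway data, where a temporary file stands in for `$RLS` and `printf` stands in for `ls -1`; the `/^[<>] /` pattern is an addition here so that diff's position lines (such as `2d1`) are skipped:

```shell
remote_list=$(mktemp)                       # stands in for $RLS
printf '123456.JPEG\n234567.JPEG\n' > "$remote_list"

# printf plays the role of "ls -1"; diff reads that listing via "-"
to_fetch=$(printf '123456.JPEG\n' |
  diff -b "$remote_list" - |
  awk '/^[<>] / { print "BAT_" $2 ".pgp" }')

rm -f "$remote_list"
printf '%s\n' "$to_fetch"
```

This saves the intermediate `$LLS` file, at the cost of not being able to inspect the local listing afterwards.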
 
Old 07-24-2014, 10:29 AM   #9
jonnybinthemix
Original Poster
Hi Pan64,

Thanks for your response.. What you've suggested looks interesting.

Sorry to be a pain, but if you've time and it's not too complex would you be able to explain your chain? The section within the {} looks new to me and I'd love to understand it as opposed to just using it

Thanks
Jon
 
Old 07-24-2014, 11:45 AM   #10
grail
Actually I think pan64 has made a small mistake, but I understand where he was going. The mistake is that split returns the number of items after the split, whereas the second argument is where we should place the 'b' variable.

So the re-write would be:
Code:
awk '/BAT_/ { a=substr($0, 5); split(a, b, "."); print b[1] }' $FTPLIST > $RLS
As a break down:

1. /BAT_/ :- Search for lines containing the string 'BAT_'

2. a=substr($0, 5) :- Assign to the variable 'a' everything stored in the record starting from the fifth character, i.e. remove 'BAT_' ... which assumes we find only files starting with this string

3. split(a, b, ".") :- Split the data stored in variable 'a' using period ('.') as the separator and store each piece in the array 'b'

4. print b[1] :- Print the data stored in the first element of the array 'b' (split() fills the array starting at index 1, not 0)

If you really wanted to, I believe you could perform the whole task in awk or bash and even at the point of not having to remove and re-add portions ... should be a nice challenge
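
One possible shape for that challenge, as a rough sketch rather than a drop-in replacement: read the local listing first, then test each remote name against it on a stripped copy, so the original `BAT_...pgp` name never has to be rebuilt. The function name `new_remote_files` and the sample data are hypothetical, and it assumes one filename per line with no trailing carriage returns (captured terminal sessions sometimes have them):

```shell
# Hypothetical helper: $1 = local listing, $2 = remote listing
new_remote_files() {
  awk '
    FILENAME == ARGV[1] { have[$0] = 1; next }   # first file: remember local names
    /^BAT_.*[.]pgp$/ {
      name = $0
      sub(/^BAT_/, "", name)                     # strip the prefix on a copy only
      sub(/[.]pgp$/, "", name)                   # strip the suffix on a copy only
      if (!(name in have)) print                 # print the untouched remote name
    }
  ' "$1" "$2"
}

# Throwaway sample data
local_list=$(mktemp); remote_list=$(mktemp)
printf '123456.JPEG\n' > "$local_list"
printf 'BAT_123456.JPEG.pgp\nBAT_234567.JPEG.pgp\n' > "$remote_list"
missing=$(new_remote_files "$local_list" "$remote_list")
rm -f "$local_list" "$remote_list"
printf '%s\n' "$missing"
```

Comparing `FILENAME` with `ARGV[1]` is used instead of the common `NR==FNR` test so that an empty local listing does not cause the remote file to be treated as the first one.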
 
Old 07-25-2014, 01:12 AM   #11
pan64
thanks grail, that was the split of perl or python.
2. a=substr($0, 5) is more or less the same as your cut -c5- command
3. and 4. split the data using . and printing the first part - that works like the sed you gave.
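
One difference may be worth noting: as far as I can tell the sed expression strips only the last extension (its `.*` is greedy), whereas `b[1]` keeps only the text before the first period. A quick sketch on a made-up name of the `BAT_*.JPEG.pgp` form shows the two results side by side:

```shell
sample='BAT_123456.JPEG.pgp'

# Original chain: drop the first four characters, then the last ".something"
via_sed=$(printf '%s\n' "$sample" | cut -c5- | sed 's/\(.*\)\..*/\1/')

# awk version: drop the first four characters, keep the part before the first "."
via_split=$(printf '%s\n' "$sample" |
  awk '/BAT_/ { a = substr($0, 5); split(a, b, "."); print b[1] }')

printf 'sed:   %s\nsplit: %s\n' "$via_sed" "$via_split"
```

So the two only agree when a name contains a single period after the prefix.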
 
Old 07-25-2014, 05:29 AM   #12
jonnybinthemix
Original Poster
Hey Guys,

Thanks for the responses.. I've been playing around with the above and it works nice.

However, the command print b[1] of course prints the first section of the array, which in this instance is just the filename.

How do I print multiple sections of the array? For example if the array (when split) has 3 parts.. how would I print parts 1 & 2 and omit just part three?

For example the filenames in the $FTPLIST variable are looking like:

BAT_123456.JPEG.pgp
BAT_234567.JPEG.pgp
BAT_345678.JPEG.pgp

So I can use; awk ' /BAT_/' to display only the above files within that variable (works fine)

then a=substr($0, 5) to print the filename from the 5th character... and getting rid of the BAT_ (works fine)

then split (a, b, ".") to create an array named b, containing each section of the filename with "." separation (works fine, because if I change print b[2] it corresponds and prints JPEG)

then print b[1] which prints the first part of the array, which in this instance would be; 123456 234567 345678 (works)

So, I think I've understood it all okay... as it makes sense.

But, if I wanted to print 123456.JPEG 234567.JPEG 345678.JPEG how would I print both parts together?

I've tried print b[1]; print b[2] - which just prints both parts separately.

I've tried print b[1,2] which doesn't work. I've also tried print b[1-2] which doesn't work.

Any ideas?

Thanks, Jon
 
Old 07-25-2014, 05:44 AM   #13
pan64
probably print b[1]" "b[2] will do that, but I'm not really sure I understand it well
 
Old 07-25-2014, 05:58 AM   #14
jonnybinthemix
Original Poster
aha.. thanks

I played around with it and this works perfectly:

Code:
awk '/BAT_/ { a=substr($0, 5); split(a, b, "."); print b[1]"." b[2]}' $FTPLIST
I needed the filename 123456.JPEG, but adding the "."b[2] did that without a hitch
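
For anyone landing here later, a short sketch of why that works, using a made-up filename: in awk, placing expressions next to each other concatenates them, while a comma in `print` inserts the output field separator. The second command is an alternative I would suggest only as an assumption about what is wanted, namely dropping just the last dot-separated part however many parts there are:

```shell
both=$(printf 'BAT_123456.JPEG.pgp\n' | awk '/BAT_/ {
  a = substr($0, 5); split(a, b, ".")
  print b[1] "." b[2]     # juxtaposition concatenates: 123456.JPEG
  print b[1], b[2]        # a comma inserts OFS (a space by default): 123456 JPEG
}')

# Alternative: remove only the final ".something" from the stripped name
trimmed=$(printf 'BAT_123456.JPEG.pgp\n' | awk '/BAT_/ {
  a = substr($0, 5); sub(/[.][^.]*$/, "", a); print a
}')

printf '%s\n%s\n' "$both" "$trimmed"
```

Subscripts like `b[1,2]` do not select a range; awk joins the two values into a single key, which is why that attempt printed nothing useful.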
 
Old 07-25-2014, 07:50 AM   #15
grail
As usual, always more than one way to skin things
Code:
awk 'match($0,/BAT_(.*)[.]pgp/,a){print a[1]}' $FTPLIST
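
A note for anyone on a non-GNU awk: as far as I know the three-argument form of `match()` is a gawk extension, so it may not work with mawk or a strictly POSIX awk. A rough portable sketch, assuming each wanted line starts with `BAT_` and ends with `.pgp` (the sample input is made up):

```shell
portable=$(printf 'BAT_123456.JPEG.pgp\nsftp> bye\n' | awk '
  /^BAT_.*[.]pgp$/ {
    sub(/^BAT_/, "")        # drop the prefix from $0
    sub(/[.]pgp$/, "")      # drop the suffix from $0
    print
  }')
printf '%s\n' "$portable"
```

Unlike the `match()` version, this anchors the prefix at the start of the line, so it would miss names that appear after other text on the same line.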
 
  


All times are GMT -5. The time now is 10:46 PM.
