Combining multiple AWK commands
Hey Guys,
I was wondering whether there's a way to combine multiple AWK commands and, if there is, whether I need to change the format. I'm currently achieving what I want by doing the following: Code:
diff -b $RLS $LLS | awk '{print $2}' | awk '$1=$1' | awk '{print "BAT_"$0".pgp" }' | while read i; do
Thanks
Jon |
Without the full context, it seems like this would be the same:
Code:
diff -b $RLS $LLS | awk '{print "BAT_" $2 ".pgp"}' | while read i; do |
Ah nice, that works :)
Makes sense... $2 straight into print.. Thanks, appreciate your help :D But, if I wanted to combine multiple commands next time, can I just stack them up? Comma separated or something? |
I am not sure I followed the point of the original logic, specifically the need for the awk in the middle?
The first awk will only ever return a contiguous group of characters (or nothing assuming less than 2 fields), so the second awk, which would typically be used to remove any additional whitespace, would have nothing to do. As for grouping statements, it would really depend on just how much work each individual awk is doing and what the output is from each to the next. |
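On the "stacking" question, a minimal sketch may help: several statements can share one awk action when they are separated by semicolons (or newlines), and a pattern in front of the action can do the filtering that a separate awk stage was doing. The sample diff output and the function name `rebuild_names` below are made up for illustration. One thing worth noting is that in normal-format diff output, lines such as `1a2` or `---` have no second field, so the `$1=$1` stage in the original pipeline was dropping the blank lines those produce; the pattern `/^[<>]/` is one way to get the same effect inside a single awk.

```shell
# Hypothetical helper: turn normal-format diff output into BAT_<name>.pgp names.
# The pattern /^[<>]/ skips change-command lines ("1a2") and separators ("---").
rebuild_names() {
    awk '/^[<>]/ { name = $2; print "BAT_" name ".pgp" }'
}

# Made-up sample of what "diff -b" might print for two filename lists.
sample_diff='1a2
> 234567.JPEG
3c4
< 345678.JPEG
---
> 456789.JPEG'

printf '%s\n' "$sample_diff" | rebuild_names
```

In the real script this would presumably sit where the three awk stages were, between the diff and the while loop.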
Hi Grail,
Thanks for your message.. The logic behind the original commands was: use diff to check the difference between the two variables (each contains a list of filenames), then awk to give me just the file name (and omit the < or > at the beginning). Then I noticed there was random white space, so I added the second awk command to make sure there's no white space, and the third awk command to add BAT_ to the beginning of each filename and .pgp to the end of each filename.
The reason for the above is that I'm downloading encrypted image files from an SFTP server via script, and want to download only new images. However, the existing images which have already been downloaded have been decrypted and had the 'BAT_' removed, and as they're decrypted the '.pgp' has also gone. So, to get two lists and do a comparison I must first take a list of what is on the server, strip the BAT_ & .pgp off, compare the two lists, take the differences, add the BAT_ & .pgp to the filenames again and then tell a loop to download all files.
I've added the first part of the code to help explain my meaning.. (It all works without a hitch, but I'm really happy to listen to other ways of doing it, and if I'm not doing it the best way, I'd love to learn). Code:
/usr/bin/expect <<! > $FTPLIST |
Quote:
|
Yes that would be nice, unfortunately I don't have this as an option.
|
you can also simplify the grep|cut|sed chain.
Code:
(not tested, because there is no sample input) |
Hi Pan64,
Thanks for your response.. What you've suggested looks interesting. Sorry to be a pain, but if you've time and it's not too complex would you be able to explain your chain? The section within the {} looks new to me and I'd love to understand it as opposed to just use it :)
Thanks
Jon |
Actually I think pan64 has made a small mistake, but I understand where he was going. The mistake is that split returns the number of items after the split, whereas the second argument is where we should place the 'b' variable. So the rewrite would be: Code:
awk '/BAT_/ { a=substr($0, 5); split(a, b, "."); print b[1] }' $FTPLIST > $RLS
1. /BAT_/ :- Search for lines containing the string 'BAT_'
2. a=substr($0, 5) :- Assign to the variable 'a' everything stored in the record starting from the fifth character, i.e. remove 'BAT_' ... which assumes we find only files starting with this string
3. split(a, b, ".") :- Split the data stored in variable 'a' using period ('.') as the separator and store each piece in the array 'b'
4. print b[1] :- Print the data stored in the first element of the array 'b' (awk arrays are indexed from 1 and not 0 {most of the time})
If you really wanted to, I believe you could perform the whole task in awk or bash, even to the point of not having to remove and re-add portions ... should be a nice challenge :) |
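As one possible take on that challenge, here is a rough sketch that does the comparison in a single awk and prints the server-side names directly, so nothing has to be stripped and re-added afterwards. It assumes both lists are plain files with one filename per line, which may not match the real $FTPLIST; the function name `new_files` and the array name `have` are made up for the example.

```shell
# Hypothetical helper: new_files LOCAL_LIST SERVER_LIST
# Prints each BAT_*.pgp name from SERVER_LIST whose stripped form
# is not already present in LOCAL_LIST.
new_files() {
    awk '
        # First file: remember every local (already decrypted) name.
        FILENAME == ARGV[1] { have[$1]; next }

        # Second file: only consider names shaped like BAT_<name>.pgp.
        $1 ~ /^BAT_.*[.]pgp$/ {
            name = substr($1, 5)          # drop the leading BAT_
            sub(/[.]pgp$/, "", name)      # drop the trailing .pgp
            if (!(name in have)) print $1 # still missing locally
        }
    ' "$1" "$2"
}

# Usage (file names are placeholders):
#   new_files local_list.txt server_list.txt
```

Testing `FILENAME == ARGV[1]` rather than the more common `NR == FNR` is a deliberate choice here, so an empty local list does not cause the server list to be mistaken for the first file.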
Thanks grail, I was thinking of the split of Perl or Python.
2. a=substr($0, 5) is more or less the same as your cut -c5- command
3. and 4. split the data using . and print the first part - that works like the sed you gave. |
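To see the two points about split side by side (the return value is a count, the pieces land in the array given as the second argument), a small throwaway sketch can print both. The function name `show_split` and the array name `parts` are only for illustration.

```shell
# Hypothetical demo: print the count returned by split, then the pieces.
show_split() {
    awk '{
        n = split(substr($0, 5), parts, ".")   # n is the number of pieces
        print n, parts[1], parts[2], parts[3]
    }'
}

printf 'BAT_123456.JPEG.pgp\n' | show_split
# prints: 3 123456 JPEG pgp
```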
Hey Guys,
Thanks for the responses.. I've been playing around with the above and it works nice. However, the command print b[1] of course prints the first section of the array, which in this instance is just the filename. How do I print multiple sections of the array? For example if the array (when split) has 3 parts.. how would I print parts 1 & 2 and omit just part three?
For example the filenames in the $FTPLIST variable are looking like:
BAT_123456.JPEG.pgp
BAT_234567.JPEG.pgp
BAT_345678.JPEG.pgp
So I can use:
awk '/BAT_/' to display only the above files within that variable (works fine)
then a=substr($0, 5) to print the filename from the 5th character... and getting rid of the BAT_ (works fine)
then split(a, b, ".") to create an array named b, containing each section of the filename with "." separation (works fine, because if I change to print b[2] it corresponds and prints JPEG)
then print b[1] which prints the first part of the array, which in this instance would be:
123456
234567
345678
(works)
So, I think I've understood it all okay... as it makes sense. But, if I wanted to print
123456.JPEG
234567.JPEG
345678.JPEG
how would I print both parts together? I've tried print b[1]; print b[2] - which just prints both parts separately. I've tried print b[1,2] which doesn't work. I've also tried print b[1-2] which doesn't work. Any ideas?
Thanks,
Jon |
probably print b[1]" "b[2] will do that, but I'm not really sure I understand it well
|
aha.. thanks :)
I played around with it and this works perfectly: Code:
awk '/BAT_/ { a=substr($0, 5); split(a, b, "."); print b[1]"." b[2]}' $FTPLIST |
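If the names might ever contain more than two dots before the final extension, a loop over the array is one way to print everything except the last piece. This is only a sketch; the function name `strip_wrapper` and the second sample name are invented, and it relies on split returning the number of pieces.

```shell
# Hypothetical helper: drop the leading BAT_ and the last dot-separated piece.
strip_wrapper() {
    awk '/BAT_/ {
        a = substr($0, 5)              # remove the leading BAT_
        n = split(a, b, ".")           # n pieces, stored in b[1]..b[n]
        out = b[1]
        for (i = 2; i < n; i++)        # rejoin all pieces except the last
            out = out "." b[i]
        print out
    }'
}

printf 'BAT_123456.JPEG.pgp\nBAT_a.b.c.pgp\n' | strip_wrapper
```

For the two sample names this should give 123456.JPEG and a.b.c.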
As usual, always more than one way to skin things :)
Code:
awk 'match($0,/BAT_(.*)[.]pgp/,a){print a[1]}' $FTPLIST |
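Note that the three-argument form of match is a gawk extension, so the one-liner above may not run under other awks. A portable sketch of the same idea, using two sub calls on lines that look like BAT_<name>.pgp, could be written as follows; the function name `strip_portable` is made up, and unlike the gawk version this one anchors the pattern to the whole line.

```shell
# Hypothetical portable variant: no gawk-only features.
strip_portable() {
    awk '/^BAT_.*[.]pgp$/ {
        sub(/^BAT_/, "")       # remove the prefix from $0
        sub(/[.]pgp$/, "")     # remove the suffix from $0
        print
    }'
}

printf 'BAT_123456.JPEG.pgp\nother.txt\n' | strip_portable
# prints: 123456.JPEG
```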