specifying fields for printing in gawk from command line
If I want to print out only specified lines (fields) from a file using gawk, I've found I can use a bash loop that looks something like this:
Code:
#!/bin/bash
I think it would be better to do this entirely from within gawk so it can print out all the wanted fields at one time, but I'm not sure how to do it. I've been studying awk/gawk tutorials for hours but I can't figure it out. Should I try to use a for loop, an array, or what? Can any awk experts help me out? (Note, printing single lines is just for the example; the actual text I want to extract will be more complex, which is why I want to use awk instead of sed or other options.) |
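For anyone landing on this question: one way to print several chosen lines in a single pass is to hand awk the whole list in one variable and test NR against it. This is only a minimal sketch, not the loop from the post above. The function name print_lines and the demo file are made up for illustration, and plain awk is used so it should behave the same under gawk.

```shell
#!/bin/bash
# Sketch: print several chosen lines of a file in one awk pass.
# print_lines is a hypothetical helper name, not from the original post.
print_lines() {
    local file=$1
    shift
    # "$*" joins the remaining arguments into ONE space-separated string,
    # so the whole list fits in a single -v assignment.
    awk -v list="$*" '
        BEGIN {
            # Turn "2 4" into want["2"] and want["4"] for membership tests.
            n = split(list, nums, " ")
            for (i = 1; i <= n; i++) want[nums[i]] = 1
        }
        NR in want    # no action given, so matching lines are printed
    ' "$file"
}

# Demo with a throwaway five-line file (assumed data).
tmp=$(mktemp)
printf '%s\n' one two three four five > "$tmp"
print_lines "$tmp" 2 4
rm -f "$tmp"
```

Called as above it should print the second and fourth lines. Note the lines come out in file order, whatever order the numbers are given in.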
Hello David the H. :)
A loop with the next command in it to iterate over the lines ... Best Charles |
Actually, to explain what I want in more detail, I have a text file that contains several hundred sections/records, and I want to be able to print out the records that I specify.
Each record consists of about a dozen lines, but not in a completely uniform pattern, which is why I need something like awk to parse them out. The only thing that's consistent is the starting line. The general pattern looks like this: Code:
#1# This is record 1. |
That being the case "$@" is good but will not work just like that because bash expands "$@" to "$1" "$2" ... "$n" (where n may be max 10?). This feature is useful when there is whitespace in the arguments. Bash would thus expand the gawk command to Code:
gawk -v list="$1" "$2" ... "$n" <stuff>
Will the arguments to the bash script be the numbers that appear between the "#" characters in "#1# This is record 1." and will they be in the same order they appear in your text file? If so, you could parse the first word out of "list" and set "list" to the remainder, then start the outer loop and keep doing "next" statements until you match <n> in "#<n># This is record <n>.". At that point you could parse the next word out of "list" ready for the next match, then start an inner loop doing "next" statements and printing each line until you find another "#<*># This is record <*>.", when you break out of the inner loop and iterate the outer loop. |
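To illustrate the point about "$@": inside -v list="$@" only the first argument ends up in list and the rest become separate words on the awk command line, whereas "$*" joins all arguments into one string. The sketch below uses "$*" and a simple flag instead of the nested next loops described above, so it is a swapped-in simplification rather than that exact algorithm. It assumes every record starts with a line of the form "#<n># ..." as in the example, and the function name and demo data are hypothetical.

```shell
#!/bin/bash
# Sketch: print the requested records in the order they occur in the file.
# print_records_in_file_order is a hypothetical name.
print_records_in_file_order() {
    local file=$1
    shift
    # "$*" (not "$@") keeps all record numbers inside the one -v assignment.
    awk -v list="$*" '
        BEGIN {
            n = split(list, ids, " ")
            for (i = 1; i <= n; i++) want[ids[i]] = 1
        }
        # A header line such as "#12# ..." starts a new record.
        /^#[0-9]+#/ {
            split($0, parts, "#")          # parts[2] holds the record number
            printing = (parts[2] in want)  # 1 while inside a wanted record
        }
        printing
    ' "$file"
}

# Demo with three short records of uneven length (assumed data).
tmp=$(mktemp)
printf '%s\n' '#1# This is record 1.' 'alpha' \
              '#2# This is record 2.' 'beta' 'gamma' \
              '#3# This is record 3.' 'delta' > "$tmp"
print_records_in_file_order "$tmp" 3 1
rm -f "$tmp"
```

With the arguments 3 1 this should print record 1 followed by record 3, because the file is read once from top to bottom.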
Or since the records are in numerical order, it could just as well print "record number n" from the file, if that would be easier. I should be able to pass the arguments to the script in any order however, and the records should ideally be output in that same order. Actually, I've already found a way to do it with sed, but I have to pipe it through the command twice for each record I want. I'm sure awk would do a better job of it, once I figure out how.
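If the output has to follow the order of the arguments rather than the order in the file, one possible approach is to collect each record's text in an array keyed by its number and print from that array in an END block. This is a rough sketch under the same assumption about "#<n>#" header lines. The names are invented for illustration, and it holds the whole file in memory, which should be harmless for a few hundred records of about a dozen lines each.

```shell
#!/bin/bash
# Sketch: print the requested records in the order given on the command line.
# print_records_in_arg_order is a hypothetical name.
print_records_in_arg_order() {
    local file=$1
    shift
    awk -v list="$*" '
        # A header line such as "#12# ..." switches to a new record number.
        /^#[0-9]+#/ {
            split($0, parts, "#")
            id = parts[2]
        }
        # Append every line to the text of the record it belongs to;
        # lines before the first header have no id and are skipped.
        id != "" { rec[id] = rec[id] $0 "\n" }
        END {
            # Walk the list in argument order and print what was collected.
            n = split(list, ids, " ")
            for (i = 1; i <= n; i++)
                if (ids[i] in rec) printf "%s", rec[ids[i]]
        }
    ' "$file"
}

# Demo with three short records (assumed data).
tmp=$(mktemp)
printf '%s\n' '#1# This is record 1.' 'alpha' \
              '#2# This is record 2.' 'beta' 'gamma' \
              '#3# This is record 3.' 'delta' > "$tmp"
print_records_in_arg_order "$tmp" 3 1
rm -f "$tmp"
```

Here the arguments 3 1 should give record 3 first and then record 1. Numbers that match no record are silently ignored.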
|
here's an approach not using RS.
Code:
#!/bin/bash
Code:
# more file |
Hello David :)
Best Charles |
Sorry to be late replying. I had a tiring couple of days.
Ghostdog74, Thank you so much. It works perfectly. Now I just need to go through it to understand exactly what it's doing. :) Of course I wasn't married to using RS or anything. I just didn't know of any other way to go about it. And Catkin, no, I don't absolutely NEED the output to be in the same order as the input, but it seems to me that a script should generally process things in the order that they're given. And having the output in a different order from the input can be a bit confusing sometimes. In any case, the code above does just what I want. |