LinuxQuestions.org


zx_ 02-23-2012 07:43 AM

Bash loop over multi-line blocks
 
I want to process a text file that basically looks like this:

Code:

...
Title: Some title
Auth.: Some author
Desc.: Some description

Title: Another title
Auth.: Another author
Desc.: Another description
...

So it consists of blocks of three lines separated by an empty line.
I want to loop over each block. Once I have a block, I know how to loop over its lines ("while read line"), extract the data I need, and process it with the command I have in mind.

I expect some advanced command like awk would do it, but I'm too new to Linux to start messing with awk, and maybe there is an easier way I'm not aware of.

TIA

grail 02-23-2012 09:57 AM

What have you tried? You mention 'while/read', is this in bash? If so, where are you stuck?

Here is some reading material which may help:

http://tldp.org/LDP/abs/html/
http://mywiki.wooledge.org/TitleIndex

David the H. 02-23-2012 10:05 AM

It depends to some extent on exactly what you intend to do with the text blocks.

In general, you probably shouldn't be thinking in terms of blocks in a shell loop anyway. Shell processing is mostly based around simple text strings and single-character delimiters, and there's generally no easy way to work with complex delimiters (two or more consecutive newlines, in this case).

Your best bet is to just iterate line-by-line, and use conditional expressions to tell it what to do with each line it reads (and particularly the blank lines, or whatever your "block" terminator is).

For example, let's say you want to store each block of text in its own array element (so you can loop through them separately later):

Code:

n=0
# IFS= and -r keep leading whitespace and backslashes in each line intact.
while IFS= read -r line; do

        # If the line is empty, move to the next array index and skip to the next line.
        if [[ -z $line ]]; then
                (( n++ ))
                continue

        # Otherwise append the line to the current array element "n".
        else
                array[n]+=${array[n]:+$'\n'}$line
        fi

done <file.txt

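
"${array[n]:+$'\n'}$line", BTW, is a trick using parameter expansion and ANSI-C-style quoting to insert a newline only if there's already text inside the array element. The extquote shell option controls whether $'...' is honored inside a double-quoted ${...} expansion. It is enabled by default in bash, so you normally don't need to set it, but shopt -s extquote turns it back on if something has disabled it.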

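Once the blocks are stored, a second pass can loop over the array and split each element back into lines. The following is only a sketch under assumptions: it needs bash (arrays, here-strings), it builds a made-up sample file in the Title:/Auth.:/Desc.: layout from the first post, and the names sample, blocks, and titles are illustrative rather than anything from the thread.

```shell
#!/usr/bin/env bash
# Made-up sample input in the format from the first post.
sample=$(mktemp)
printf '%s\n' 'Title: Some title' 'Auth.: Some author' 'Desc.: Some description' '' \
              'Title: Another title' 'Auth.: Another author' > "$sample"

# Pass 1: collect each blank-line-separated block into one array element.
blocks=()
n=0
while IFS= read -r line; do
    if [[ -z $line ]]; then
        # Advance only if the current element holds text, so repeated
        # blank lines do not leave empty elements behind.
        [[ -n ${blocks[n]:-} ]] && n=$(( n + 1 ))
        continue
    fi
    # Prepend a newline only when the element already has content.
    blocks[n]+=${blocks[n]:+$'\n'}$line
done < "$sample"
rm -f "$sample"

# Pass 2: loop over the stored blocks, then over the lines of each block.
titles=()
for block in "${blocks[@]}"; do
    title=
    while IFS= read -r line; do
        case $line in
            'Title: '*) title=${line#'Title: '} ;;
        esac
    done <<< "$block"
    titles+=("$title")
    printf 'block title: %s\n' "$title"
done
```

The here-string (<<<) feeds one array element to the inner loop, so the inner "while read" sees only the lines of the current block.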


Your other usual option is to use an external tool of some kind to split or process the text first into something more manageable by the shell.

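This example uses awk to mark the end of each block with an '@' character, which read can then use as a delimiter. Setting RS to an empty string puts awk in paragraph mode, where blank lines separate records; a multi-character RS such as '\n\n' only works in awks that support it (gawk, for one). It assumes '@' never appears in the text.

Code:

n=0
while IFS= read -r -d '@' block; do

        array[n++]=$block

done < <( awk -v RS= '{ printf "%s@", $0 }' file.txt )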

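If the per-block work is simple, awk can also do it directly instead of handing the blocks back to the shell. This is a hedged sketch, not something posted in the thread: it assumes the "Title: " and "Auth.: " prefixes from the first post, uses a made-up sample file, and prints "(none)" as an arbitrary placeholder for a missing field.

```shell
# Made-up sample input; the second block has no Auth. line.
sample=$(mktemp)
printf '%s\n' 'Title: Some title' 'Auth.: Some author' 'Desc.: Some description' '' \
              'Title: Another title' 'Desc.: Another description' > "$sample"

# RS= (empty) selects paragraph mode: blank lines separate records.
# -F '\n' makes every line of a block its own field.
result=$(awk -v RS= -F '\n' '
{
    title = author = "(none)"
    for (i = 1; i <= NF; i++) {
        # Both prefixes are 7 characters long, so the value starts at position 8.
        if      (index($i, "Title: ") == 1) title  = substr($i, 8)
        else if (index($i, "Auth.: ") == 1) author = substr($i, 8)
    }
    printf "%d: %s by %s\n", NR, title, author
}' "$sample")
rm -f "$sample"

printf '%s\n' "$result"
```

Here NR counts blocks rather than lines, because each blank-line-separated block is one record.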

schneidz 02-23-2012 10:17 AM

Code:

while IFS= read -r line; do if [ -z "$line" ]; then echo hello-world; else echo "line = $line"; fi; done < test.tmp

zx_ 02-23-2012 11:02 PM

OK, thanks guys. I guess it's a little too early for awk, since it's feasible to do it the way you all suggested: use "while read" and assign variables along the way, then, when the line is empty, run the commands and reset the variables.

Cheers

PS: the reset is needed because sometimes not all three fields are present.
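
The approach described here could look roughly like the sketch below. It is an illustration under assumptions, not the poster's actual script: process_record is a hypothetical stand-in for the real command, process_file is an illustrative wrapper, and the sample file is made up, with a second block that lacks the Auth. line to show why the reset matters.

```shell
# Made-up sample input; the second block has no Auth. line and
# the file does not end with a blank line.
sample=$(mktemp)
printf '%s\n' 'Title: Some title' 'Auth.: Some author' 'Desc.: Some description' '' \
              'Title: Another title' 'Desc.: Another description' > "$sample"

# Hypothetical stand-in for the real per-block command.
process_record() {
    printf '[%s] [%s] [%s]\n' "$title" "$auth" "$desc"
}

process_file() {
    title= auth= desc=
    # "|| [ -n "$line" ]" also handles a final line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            'Title: '*) title=${line#'Title: '} ;;
            'Auth.: '*) auth=${line#'Auth.: '} ;;
            'Desc.: '*) desc=${line#'Desc.: '} ;;
            '')
                # Blank line: the block is complete. Process it, then reset
                # so a missing field does not inherit the previous value.
                if [ -n "$title$auth$desc" ]; then process_record; fi
                title= auth= desc=
                ;;
        esac
    done < "$1"
    # The last block is usually not followed by a blank line.
    if [ -n "$title$auth$desc" ]; then process_record; fi
}

output=$(process_file "$sample")
rm -f "$sample"
printf '%s\n' "$output"
```

Without the reset, the second record would silently carry over "Some author" from the first block.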

David the H. 02-26-2012 11:02 AM

It's never too early to learn the tools of the trade. I generally recommend learning at least the basic usage of sed, awk, and find as soon as possible, as well as the basics of regular expressions, which all three programs (and bash itself) support.

While it can take a long time to truly understand all their nuances, it's really not that hard to reach a reasonable level of proficiency, and doing so will enhance your scripting ability many-fold.

Here are a few useful sed references.
http://www.grymoire.com/Unix/Sed.html
http://sed.sourceforge.net/grabbag/
http://sed.sourceforge.net/sedfaq.html
http://sed.sourceforge.net/sed1line.txt

Here are a few useful awk references:
http://www.grymoire.com/Unix/Awk.html
http://www.gnu.org/software/gawk/man...ode/index.html
http://www.pement.org/awk/awk1line.txt
http://www.catonmat.net/blog/awk-one...ined-part-one/

Here are a couple of links about using find:
http://mywiki.wooledge.org/UsingFind
http://www.grymoire.com/Unix/Find.html

A couple of regular expressions tutorials:
http://mywiki.wooledge.org/RegularExpression
http://www.grymoire.com/Unix/Regular.html

