I use awk for everything, including the laundry.
You search for lines that contain "[thr". When you find such a line, reset the line counter to 1, then peel off the value between the square brackets. This can be done with the sub() function: sub(/.*\[/,"") removes everything up to and including the opening bracket, and sub(/\].*/,"") removes everything starting with the closing bracket. Store what remains after the two sub() calls. For example:
Code:
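/\[thr/ { linectr = 1        # a new block starts: reset the counter
          sub(/.*\[/,"")     # drop everything up to and including "["
          sub(/\].*/,"")     # drop "]" and everything after it
          thr = $0           # what remains is the value between the brackets
          next }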
Note that $0 refers to the entire line, here after the unwanted parts have been stripped. The next statement skips the remaining rules and advances to the next input line.
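To see what the two sub() calls leave behind, here is a small sketch run on a single made-up line. The sample line is only my guess at what the input looks like, so adjust it to your real data.

```shell
# Hypothetical sample line; the real input format is an assumption.
out=$(printf '%s\n' '17 worker [thr 3] started' |
  awk '/\[thr/ { sub(/.*\[/, ""); sub(/\].*/, ""); print }')
echo "$out"
```

Note that .* is greedy, so if a line holds more than one "[", everything up to the last one is removed.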
When awk encounters any other line, it just increments the counter with the rule { linectr += 1 }.
How are you going to output the line count? Whenever a line contains [thr, you print the thr value and line count of the previous block. The only time you don't do this is at the first occurrence of [thr, which I assume is the first line of the file. The last block has no following [thr line to trigger its output, so an END rule prints it. Thus, the entire program looks like this:
Code:
/\[thr/ { if (NR>1) print thr " ; " linectr    # report the previous block
          linectr = 1
          sub(/.*\[/,"")
          sub(/\].*/,"")
          thr = $0
          next }
        { linectr += 1 }
END     { if (NR>0) print thr " ; " linectr }  # report the last block
Put this in a file, for example process-thr.awk, and run awk -f process-thr.awk YOUR-INPUT.
WARNING: This is not tested. I assume that the first thr block starts in the first line, and that the file contains no other lines than thr blocks.
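As a quick sanity check, the program can be tried on a tiny made-up input. The five sample lines below are hypothetical and only follow the assumptions stated above. The sketch passes the program inline rather than with -f, and it includes an END rule so that the final block is reported too.

```shell
# Hypothetical sample input: two thr blocks of 3 and 2 lines.
sample='1 [thr 1] begin
2 foo
3 bar
4 [thr 2] begin
5 baz'

out=$(printf '%s\n' "$sample" | awk '
/\[thr/ { if (NR > 1) print thr " ; " linectr   # report the previous block
          linectr = 1
          sub(/.*\[/, "")
          sub(/\].*/, "")
          thr = $0
          next }
        { linectr += 1 }
END     { if (NR > 0) print thr " ; " linectr } # report the final block
')
echo "$out"
```

The count includes the [thr line itself, because the counter is reset to 1 rather than 0.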
This can certainly be improved. Instead of counting the lines, for example, one could use the number at the beginning of each line.
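One possible reading of that idea, as a rough sketch: if every line starts with a running line number in the first field (this is an assumption about your input, as are the sample lines and the variable names num, first, last and seen), the block length is the last number minus the first number plus one. The number has to be saved before sub() rewrites $0, because awk re-splits the fields afterwards.

```shell
# Hypothetical sample input; field 1 is assumed to be a running line number.
sample='1 [thr 1] begin
2 foo
3 bar
4 [thr 2] begin
5 baz'

out=$(printf '%s\n' "$sample" | awk '
        { num = $1 }                    # save the number before $0 changes
/\[thr/ { if (seen) print thr " ; " (last - first + 1)
          first = num; last = num       # this block starts here
          sub(/.*\[/, "")
          sub(/\].*/, "")
          thr = $0
          seen = 1
          next }
        { last = num }                  # most recent number in this block
END     { if (seen) print thr " ; " (last - first + 1) }
')
echo "$out"
```

Using the seen flag instead of NR>1 also drops the requirement that the first thr block starts on the first line.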