Will gawk extract bits of text fields from a few thousand identically structured files?
I have several thousand small text files with the following structure:
<NAME> WATSON ANTHONY M </NAME>
<DOB> 01 21 70 </DOB>
<xref image="00003RHV.TIF|V3|1999:11:23:15:54:04.00|51981|0"> image: </xref>
-------<xref image="00003RHW.TIF|V3|1999:11:23:15:54:16.00|59254|0"> image: </xref>
-------<xref image="00003RHX.TIF|V3|1999:11:23:15:54:18.00|60390|0"> image: </xref>
-------<xref image="00003RHY.TIF|V3|1999:11:23:15:54:18.00|38973|0"> image: </xref>
Each file has a different NAME value and a different DOB value, and can contain from 1 to 50 "image" lines.
I am going to write a shell program that reads these few thousand text files and writes a single output file in the following format, one line per image:
NAME1, DOB1, image1.tif
NAME1, DOB1, image2.tif
NAME1, DOB1, image3.tif
NAME2, DOB2, image4.tif
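For illustration, a minimal awk sketch of this transformation might look like the following. It assumes that each file holds exactly one NAME and one DOB line ahead of its image lines, and that the image file name is everything between `image="` and the first `|`. The function name `extract_records` is hypothetical, and the sketch sticks to POSIX awk features, so it should not need gawk specifically.

```shell
#!/bin/sh
# extract_records (hypothetical name): print one "NAME, DOB, image" line
# for every image line found in the files given as arguments.
extract_records() {
    awk '
    # Strip leading and trailing blanks from a string.
    function trim(s) {
        sub(/^[ \t]+/, "", s)
        sub(/[ \t]+$/, "", s)
        return s
    }

    # First line of each input file: forget the previous NAME and DOB.
    FNR == 1 { name = ""; dob = "" }

    # Keep the text between <NAME> and </NAME>.
    /<NAME>/ {
        name = $0
        sub(/.*<NAME>/, "", name)
        sub(/<\/NAME>.*/, "", name)
        name = trim(name)
    }

    # Keep the text between <DOB> and </DOB>.
    /<DOB>/ {
        dob = $0
        sub(/.*<DOB>/, "", dob)
        sub(/<\/DOB>.*/, "", dob)
        dob = trim(dob)
    }

    # Image line: take the attribute value up to the first "|".
    # The literal prefix image=" is 7 characters long, so skip it.
    match($0, /image="[^|"]+/) {
        img = substr($0, RSTART + 7, RLENGTH - 7)
        print name ", " dob ", " img
    }
    ' "$@"
}
```

A possible invocation, assuming the input files end in `.txt` and the result should go to `all_records.csv` (both names are placeholders), would be `extract_records *.txt > all_records.csv`. If the file list ever exceeds the shell's argument-length limit, the files would have to be fed to awk in batches instead.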
Is sed or awk (gawk) the best tool for this process?
Thanks in advance for any assistance. I haven't used awk or sed in approximately 8 years and have forgotten their specific attributes and suitability, but I programmed professionally for approximately 10 years (which I gave up for hardware platform / infrastructure support about 8 years ago).