Extracting text with grep or awk?
Hey, kids!
I have a couple of text file reports in a format like this:

8/25/2004 8:23:30 AM 0x1000000 0x1 5,059 E:\Project 68950\Project 68950 2-up cover report
8/25/2004 8:23:32 AM 0x1000000 0x1 675,328 E:\Project 68950\Project 68950 2-up covers
7/13/1990 1:00:14 PM 0x1000000 0x1 0 E:\Project 68950\Fonts\Helve
8/5/1993 5:02:32 AM 0x1000000 0x1 0 E:\Project 68950\Fonts\HelveNeuMedCon
8/25/2004 8:19:02 AM 0x1000000 0x1 0 E:\Project 68950\Fonts\Helvetica
8/13/2004 10:49:30 AM 0x1000000 0x1 0 E:\Project 68950\Fonts\Helvetica Neue Condensed 3

All I need out of it are the "Project 68950" sections, and preferably only unique occurrences. Columns are separated by varying numbers of spaces, rather than tabs, but the text I need will always appear between the first pair of backslashes in each line. Because of this I thought grep might be the way to go, but can't figure out how to do it. And unfortunately I know more about full-contact knitting than I do about awk. Any advice?

(For added marks: in the other file the text I need is between the second and third backslashes.)

Cheers!
--Gord
In awk ...
Code:
awk '{for(i=7;i<=NF;i++)printf $i" ";print""}' <report_name> | uniq
Code:
sed 's/.*\(E:\)/\1/g' <report_name>|uniq
Cheers, Tink
Tink!
Many thanks. Unfortunately, those give me the whole path (e.g., "E:\Project 68950\Project 68950 2-up cover report") for each file/directory in "Project 68950," when all I want is unique occurrences of "Project 68950." These reports are the contents of backup tapes and all I need to know is which projects are on them, not every file.

I dug around in the man pages of awk and sed to figure out your examples and I think I burst a blood vessel. If you can take another crack I'd appreciate it.

Cheers!
--Gord
Oh ... I got you wrong the first time round, that makes it even easier ;) ... when you referred to it as a section I assumed you wanted the font names.
Code:
awk -F\\ '{print $2}' <report_name> | uniq
Cheers, Tink
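
For the "added marks" case, the same idea should carry over. With a backslash as the field separator, $2 is the text between the first and second backslashes and $3 is the text between the second and third. Note too that uniq only collapses duplicates on adjacent lines, so sort -u is the safer choice if a project's files could be scattered through a report. A minimal sketch, assuming a POSIX shell and awk; the helper name unique_field is made up for illustration and is not from the thread:

```shell
#!/bin/sh
# Sketch only: unique_field is a hypothetical helper name, not from the thread.
# Usage: unique_field FIELD [FILE...]   (reads stdin when no FILE is given)
#   FIELD 2 = text between the 1st and 2nd backslash
#   FIELD 3 = text between the 2nd and 3rd backslash
unique_field() {
    field=$1
    shift
    # -F'\\' splits each line on a literal backslash; NF >= f skips lines
    # that do not have that many fields. sort -u removes duplicates even
    # when they are not on adjacent lines.
    awk -F'\\' -v f="$field" 'NF >= f { print $f }' "$@" | sort -u
}

# Two sample lines taken from the report in the question.
unique_field 2 <<'EOF'
8/25/2004 8:23:30 AM 0x1000000 0x1 5,059 E:\Project 68950\Project 68950 2-up cover report
8/5/1993 5:02:32 AM 0x1000000 0x1 0 E:\Project 68950\Fonts\HelveNeuMedCon
EOF
```

For the other report, the call would presumably be unique_field 3 followed by that report's file name.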
Perfect! Grazie!