
tnik 11-01-2007 08:00 AM

how to use grep/sed
I have a text file I'm trying to split up into multiple files.. the format is like this:


O1234 (filename.ext)
(text here)
X123.23 Y023.
G42 X2234. Y0.

01235 (diffilename.ext)
(text here)
X123.23 Y023.
G42 X2234. Y0.
X123.23 Y023.

(text here)
X123.23 Y023.
G42 X2234. Y0.
X123.23 Y023.

Now what I would like to do is parse the file, if (filename.ext) exists after the O##### then name the file that, and put everything from O### to M30 in it. If it doesn't exist, just use the O##### as the filename.

I'm not even sure where to start on this.. Any help would be greatly appreciated.



Agrouf 11-01-2007 11:43 AM

It's not sed but


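while IFS= read -r line
do
        x1=$(echo "$line" | cut -f1 -d" ")
        # a line starting with O (or 0) plus digits begins a new program
        if echo "$x1" | grep -q '^[O0][0-9][0-9]*$'
        then
          # -s makes cut print nothing when the line has no second field
          x2=$(echo "$line" | cut -s -f2 -d" ")
          filename=$x1
          [ -n "$x2" ] && filename=$(echo "$x2" | tr -d '()')
        fi
        [ -n "$filename" ] && echo "$line" >>"$filename"
done < file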

Tinkster 11-04-2007 12:18 AM

And an awk-version ... :}

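awk '{
  if( $1 ~ /^[O0][0-9]+$/ ){   # header line: pick the output file (gensub needs gawk)
    file = ( $2 ~ /^\(.+\)$/ ) ? gensub( /^\((.+)\)$/ , "\\1", 1, $2) : $1
  }
  if( file != "" ) print $0 > file
}' file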


sundialsvcs 11-06-2007 07:51 PM

There are four great tools in the Unix/Linux world that are terrific for handling problems like these. I'll introduce them individually....

sed ("stream editor") is very useful when you have a single file that you want to do something to, to produce another single file as output. (In Linux/Unix-land, this idea is often applied as a "filter" when "piping" things ... but that's another story...)
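To make the "one file in, one file out" idea a bit more concrete, here is a minimal sketch. The names in.txt and out.txt are made up for the illustration, and it assumes each program ends with a line starting with M30, as the original question describes.

```shell
#!/bin/sh
# in.txt and out.txt are hypothetical names used only for this illustration.
printf '%s\n' 'O1234 (filename.ext)' 'X123.23 Y023.' 'M30' \
              'O1235' 'G42 X2234. Y0.' 'M30' > in.txt

# One file in, one file out: keep only the lines from the O1234 header
# through the first following line that starts with M30.
sed -n '/^O1234/,/^M30/p' in.txt > out.txt

# The same kind of command used as a filter: sed reads whatever the pipe feeds it.
cat in.txt | sed -n '/^O1235/,/^M30/p'
```

This pulls out one program at a time, which is fine for a one-off, but you would have to know every O-number in advance. That is why it is only a partial answer to the splitting problem.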

grep is a great tool for finding which files contain a particular string. It grows on you... For example, when I needed to find all of the files in a great-big directory (over 3,500 files in various subdirectories) that contained the word "arp" as a whole word (that is to say, not preceded or followed by a letter, digit, or underscore), regardless of UPPer or LoWeR CAse, I "merely" typed: grep -rilw arp ~/projects/* Nothin' to it... ;)
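If you want to see what those four flags do before pointing them at a real tree, here is a small sketch. The demo directory and its three files are invented for the illustration.

```shell
#!/bin/sh
# demo/ and its contents are hypothetical, created only for this illustration.
mkdir -p demo/sub
printf 'the ARP table\n' > demo/a.txt       # matches: -i ignores case
printf 'a sharp knife\n' > demo/sub/b.txt   # no match: "arp" is inside "sharp"
printf 'run arp -a\n'    > demo/sub/c.txt   # matches: whole word

# -r recurse into subdirectories, -i ignore case,
# -l list file names only, -w match whole words only
grep -rilw arp demo | sort
```

With these three files, only demo/a.txt and demo/sub/c.txt should be listed.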

awk is probably the tool that you want in this case. The file that you need to process has certain definitely-identifiable characteristics, such as:
  • There's one "record" per "line," and it seems that "fields" in each "record" are separated by "one or more spaces."
  • "A line that begins with 'O' followed by one-or-more 'digits'" marks the beginning of "something I am interested in," and when I see such a record, "the second field" (filename) "is interesting."
  • After I have seen a record like that, zero-or-more records contain useful text...
  • "But when I see a record starting with 'X' followed by zero-or-more 'digits'" I want to...
Well, you get the idea. awk is a tool that's designed for things like that.
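As a rough sketch of how those observations turn into an awk program (not the only way to do it): the version below assumes a POSIX awk, an input file called programs.txt (a made-up name), file names without spaces inside the parentheses, and that a line starting with M30 ends each program, as the original question describes.

```shell
#!/bin/sh
# Hypothetical sample input modeled on the question; programs.txt is an assumed name.
cat > programs.txt <<'EOF'
O1234 (filename.ext)
(text here)
X123.23 Y023.
M30

O1235
G42 X2234. Y0.
M30
EOF

awk '
# A line whose first field is "O" plus digits starts a new program.
$1 ~ /^O[0-9]+$/ {
    if (out != "") close(out)      # finish the previous output file
    out = $1                       # default name: the O-number itself
    if ($2 ~ /^\(.+\)$/)           # a "(name)" second field overrides it
        out = substr($2, 2, length($2) - 2)
}
# Copy every line that belongs to the current program.
out != "" { print > out }
# M30 ends the program; stop writing until the next O-line.
$1 == "M30" { close(out); out = "" }
' programs.txt
```

Run against the sample above, this should leave two files in the current directory: filename.ext (the first program) and O1235 (the second, which has no name in parentheses). Lines between an M30 and the next O-line are dropped.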

For the truly adventurous, the programming language perl was actually designed by a person who started his quest by "extending awk" and ... well ... "one thing led to another," as things in our peculiar industry so often do.

All times are GMT -5.