LinuxQuestions.org


vikas027 10-29-2007 11:20 AM

split files by specifying a string (bash shell)
 
Hi all,

I have a file of around 300 lines in which the string "SERVER" occurs around 32 times.

For example:
Quote:

SERVER
.....
.....
....

SERVER

.....
.....
....

SERVER.....
.....
....
I need to split the file into separate files, for example:

Quote:

file1
SERVER
....
....
....

file2
SERVER
.....
...
....


file3
SERVER
.....
....
....
I am using this code
awk '/SERVER/{n++}{print > f n}' f=/vikas/list /vikas/final

The problem is that it creates a maximum of 10 files, but I need more than 30.
I have tried using nawk, but it didn't work.
I am using bash scripting on SunOS.


Is there any other way of splitting this data?

Please help!

Thanks in advance.
Regards,
Vikas

colucix 10-29-2007 03:55 PM

Yes, Solaris awk limits the number of files that can be open at the same time, but with nawk you can close each output file once you have finished writing to it. That way you can write an unlimited number of files. I am thinking of something like this:
Code:

nawk '/SERVER/{close(oufile) ; n++ ; oufile=sprintf("%s%02d",f,n)}{ print > oufile }' f=/vikas/list /vikas/final
I hope this helps.
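
For anyone who wants to try the close-then-reopen idea on throwaway data first, here is a minimal sketch. The helper name split_on_marker, the sample data, and the temporary directory are assumptions for illustration and not part of the thread; on Solaris, nawk or /usr/xpg4/bin/awk would presumably have to stand in for awk.

```shell
#!/bin/sh
# Sketch: split a file at every line containing a marker string, keeping
# only one output file open at a time. Names and sample data are illustrative.

# split_on_marker INPUT PREFIX MARKER
# Writes PREFIX01, PREFIX02, ... (hypothetical helper, not from the thread).
split_on_marker() {
    awk '
        index($0, marker) {
            if (out != "") close(out)             # release the previous file
            out = sprintf("%s%02d", prefix, ++n)  # zero-padded: 01, 02, ...
        }
        out != "" { print > out }                 # ignore lines before the first marker
    ' prefix="$2" marker="$3" "$1"
}

workdir=$(mktemp -d) || exit 1
printf '%s\n' SERVER a1 a2 SERVER b1 SERVER c1 c2 > "$workdir/final"
split_on_marker "$workdir/final" "$workdir/list" SERVER
ls "$workdir"
rm -rf "$workdir"
```

Because each file is closed before the next one is opened, the number of sections should no longer be bounded by the open-file limit. The zero padding keeps the names the same length, so they sort in section order.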

ghostdog74 10-29-2007 07:01 PM

Code:

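awk '
    /SERVER/ {
        close("out" c)       # close the previous output file
        ++c
    }
    {
        # Parenthesize the target: an unparenthesized concatenation
        # after ">" is parsed differently by different awks.
        print > ("out" c)
    }
' "file"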

Use nawk on Solaris.

colucix 10-30-2007 03:00 AM

ghostdog74, maybe I am missing something, but what is the improvement of your reply over mine? "Use indentation"?! "Don't use external variables"?! ...

ghostdog74 10-30-2007 08:57 AM

Quote:

Originally Posted by colucix (Post 2941887)
ghostdog74, maybe I am missing something, but what is the improvement of your reply over mine? "Use indentation"?! "Don't use external variables"?! ...

Hmm, let's see: one extra call to sprintf()? That may not be significant in this case. I just like to keep it simple, that's all. And yes, since you mentioned indentation, it also plays a part in readability.

colucix 10-30-2007 09:31 AM

Hmmm... it looks like you went looking for details only after my last post and did not really consider the previous solution. The same goes for editing your code to remove an unneeded part. The call to sprintf in my contribution just makes the filenames the same length. Anyway, I am not questioning the code, the solution, or your ability in awk programming. I have seen other posts from you and you're really good. No kiddin'. What makes me angry sometimes is that a lot of people reply to the OP without caring about replies from other members. This is a little unfair, since every little contribution is worth reading before deciding that something more can be added: comments, add-ons, notes, disapprovals... whatever! Maybe that's just my personal opinion.

ghostdog74 10-30-2007 09:55 AM

Wow, you are so sensitive and so serious... relax. I sometimes feel the same way, but so what?! Life goes on. For example, sometimes you know someone has posted a homework assignment and you refrain from answering, but so what? Someone else will just come in and provide an answer. Should I get angry every time that happens? I just move on. There are more important things to do in life than be concerned about minor things like this.

makyo 10-30-2007 10:20 AM

Hi.
Quote:

Originally Posted by colucix (Post 2942255)
... What makes me angry sometimes is that a lot of people reply to the OP without caring about replies from other members. This is a little unfair, since every little contribution is worth reading before deciding that something more can be added: comments, add-ons, notes, disapprovals...

I agree with the first statement insofar as it can be frustrating, but it hasn't made me angry -- I see it as a possible form of rudeness -- possible because we don't know if it's intentional or not.

I don't agree that every reply is a worthwhile contribution, because they can be inane, erroneous, malicious, off-topic, etc.

As ghostdog74 says -- just move on, or do some yoga, get a drink of water, whatever helps you relax ... cheers, makyo

colucix 10-30-2007 10:47 AM

Got the point! You are both right! I think I'll get something to drink and relax now... cheers! :)

makyo 10-30-2007 11:04 AM

Hi.

The csplit command was designed to solve this kind of problem in general:
Code:

#!/usr/bin/env sh

# @(#) s2      Demonstrate csplit, context split.

set -o nounset
echo

debug=":"
debug="echo"

## Use local command version for the commands in this demonstration.

echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version bash csplit edges my-nl

echo

# Remove previous debris.
rm -f xx*

echo " Current files, sample of final:"
ls -og final xx*

echo
edges -l 2 final

# Replace "*" with "99" for older versions of csplit.
csplit -k -q final /SERVER/ '{*}'

echo
echo " Result files and sample:"

ls xx*

my-nl xx11

exit 0

Producing:
Code:

% ./s2

(Versions displayed with local utility "version")
GNU bash 2.05b.0
csplit (coreutils) 5.2.1
edges (local) 287
my-nl (local) 294

 Current files, sample of final:
ls: xx*: No such file or directory
-rw-r--r--  1 866 Oct 29 05:38 final

    1  SERVER
    2  End of section 1
  ...
    69  SERVER
    70  End of section 35

 Result files and sample:
xx00  xx03  xx06  xx09  xx12  xx15  xx18  xx21  xx24  xx27  xx30  xx33
xx01  xx04  xx07  xx10  xx13  xx16  xx19  xx22  xx25  xx28  xx31  xx34
xx02  xx05  xx08  xx11  xx14  xx17  xx20  xx23  xx26  xx29  xx32  xx35

==> xx11 <==

  1 SERVER
  2 End of section 11

See man csplit for details ... cheers, makyo
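
A stripped-down sketch of the same idea, without the local helper utilities (version, edges, my-nl), might look like the following. It assumes GNU csplit, since the -z option and the '{*}' repeat count are GNU extensions. The helper name csplit_sections, the part_ prefix, and the sample data are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch: context-split with GNU csplit. Sample data and the part_ prefix
# are illustrative.

# csplit_sections INPUT PREFIX PATTERN  (hypothetical helper)
csplit_sections() {
    # -s: do not print the size of each piece
    # -z: drop empty pieces, such as the one before a match on line 1
    # -f: output file name prefix (the default is xx)
    # '{*}': repeat the pattern as many times as possible (GNU extension)
    csplit -s -z -f "$2" "$1" "/$3/" '{*}'
}

workdir=$(mktemp -d) || exit 1
printf '%s\n' SERVER 'End of section 1' SERVER 'End of section 2' \
              SERVER 'End of section 3' > "$workdir/final"
csplit_sections "$workdir/final" "$workdir/part_" SERVER
ls "$workdir"
rm -rf "$workdir"
```

Without -z, the piece before the first match is written even when it is empty, which is why the run above shows 36 files (xx00 to xx35) for 35 sections.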

vikas027 11-01-2007 05:52 AM

Hi All
 
Hi,

I tried lots of commands, including the ones above. They all run fine on Linux machines
but not on the Solaris machines; I don't know the reason.

Anyway, MANY MANY THANKS to all for your time and help. I found that this command works perfectly.

Code:

/usr/xpg4/bin/awk '/SERVER/{n++}{print > f n}' f=/vikas/list /vikas/final
Thanks again.
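
Since the same one-liner behaved differently depending on which awk was picked up, a script that has to run on both Linux and Solaris could choose the interpreter explicitly. This is only a sketch: the helper name pick_awk and the candidate order are assumptions, not something tested in this thread, and it relies on the shell providing `command -v`.

```shell
#!/bin/sh
# pick_awk: print the first usable awk from a list of candidates
# (hypothetical helper; the candidate order is an assumption).
pick_awk() {
    for candidate in /usr/xpg4/bin/awk nawk gawk awk; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

AWK=$(pick_awk) || { echo "no awk found" >&2; exit 1; }
echo "using: $AWK"

# Quick check that the chosen interpreter actually runs.
"$AWK" 'BEGIN { print "ok" }'
```

The split command can then be invoked as "$AWK" instead of a hard-coded awk path.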

ghostdog74 11-01-2007 06:26 AM

Quote:

Originally Posted by vikas027 (Post 2944306)
/usr/xpg4/bin/awk

Did you try nawk?

vikas027 11-01-2007 06:46 AM

Quote:

Originally Posted by ghostdog74 (Post 2944332)
Did you try nawk?

Yes, I tried nawk (and gawk too); it gave an error even though I used an absolute path.


All times are GMT -5.