LinuxQuestions.org
Old 03-21-2012, 10:30 AM   #1
mikes88
Split large file into smaller files


Hello again,

I've been bashing my head on this one, trying to adapt different AWK scripts to split my file into smaller files based on a specific line.

At the top of each block of data there's a header line that reads:

Code:
/ info / info / servername / info / info / info / info
Then there's a bunch of data under it, reading:

Code:
Wed Feb 01 08:51:24 EST 2012,Wed Feb 01 08:51:44 EST 2012,1098.35,
Wed Feb 01 08:51:44 EST 2012,Wed Feb 01 08:52:04 EST 2012,915.25,
Wed Feb 01 08:52:04 EST 2012,Wed Feb 01 08:52:24 EST 2012,937.7,
Wed Feb 01 08:52:24 EST 2012,Wed Feb 01 08:52:44 EST 2012,957.25,
Wed Feb 01 08:52:44 EST 2012,Wed Feb 01 08:53:04 EST 2012,703.3,
Wed Feb 01 08:53:04 EST 2012,Wed Feb 01 08:53:24 EST 2012,757.75,
Wed Feb 01 08:53:24 EST 2012,Wed Feb 01 08:53:44 EST 2012,891.8,
Wed Feb 01 08:53:44 EST 2012,Wed Feb 01 08:54:04 EST 2012,951.75,
Wed Feb 01 08:54:04 EST 2012,Wed Feb 01 08:54:24 EST 2012,997.25,
Wed Feb 01 08:54:24 EST 2012,Wed Feb 01 08:54:44 EST 2012,1018.4,
Wed Feb 01 08:54:44 EST 2012,Wed Feb 01 08:55:04 EST 2012,1046.95,
Wed Feb 01 08:55:04 EST 2012,Wed Feb 01 08:55:24 EST 2012,838.3,
Wed Feb 01 08:55:24 EST 2012,Wed Feb 01 08:56:04 EST 2012,912.05,
What I need to do is read that header line, then write every line under it to a file named after the servername. There are multiple servers in the same file, and each server's data starts with a header line like this:

Code:
/ info / info / servername / info / info / info / info
 
Old 03-21-2012, 10:40 AM   #2
acid_kewpie
Keeping this within awk, something like this shoouuuuld work. Untested...
Code:
    awk -F"/" '{
      if { $4 != "" } {
         OUTFILE = $4
      } else {
         print $0 OUTFILE
      }
    }' filename
There's no validity checking whatsoever, but if your input is reliable, it should be fine. Offhand, I'm not totally sure whether the server name will be the 3rd or 4th field, as the line starts with a field separator. I'd expect it to be the 4th though.
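A quick way to settle the 3rd-or-4th question is to print the fields in brackets. This is only a sketch: the `header` variable below is a stand-in copied from the format in post #1, not a real header line.

```shell
# Hypothetical header line, copied from the format shown in post #1
header='/ info / info / servername / info / info / info / info'

# Print the first four fields in brackets to expose empty fields and stray spaces
fields=$(printf '%s\n' "$header" |
  awk -F"/" '{ for (i = 1; i <= 4; i++) printf "%d=[%s]\n", i, $i }')
printf '%s\n' "$fields"
```

Because the line starts with the separator, field 1 is empty and the server name lands in field 4, with a space on either side of it.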

Last edited by acid_kewpie; 03-21-2012 at 10:43 AM.
 
Old 03-21-2012, 10:51 AM   #3
mikes88
Original Poster
I tried to execute the code but was getting syntax errors...

Code:
awk: cmd. line:1:       if { $7 != "" } {
awk: cmd. line:1:          ^ syntax error
awk: cmd. line:3:       } else {
awk: cmd. line:3:         ^ syntax error
 
Old 03-21-2012, 10:53 AM   #4
acid_kewpie
Think of it as an exercise for the reader. ;-) I mixed up my Tcl with my AWK...
 
Old 03-21-2012, 11:01 AM   #5
druuna
Hi,

I do believe that the generated OUTFILE also has a nice surprise
 
Old 03-21-2012, 11:01 AM   #6
colucix
Chris, may I correct the code?
Code:
    awk -F" / " '{
      if ( $3 != "" )
         OUTFILE = $3
      else 
         print > OUTFILE
    }' filename
Using spaces in the field separator avoids spaces in the output file name. You then have to explicitly redirect the output of the print statement to the output file, using the ">" symbol as in bash. The one difference from bash redirection is that awk truncates the file only the first time it opens it; later prints to the same file during the same run are appended.
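To see how awk's ">" behaves within a single run, here is a small sketch. The file name demo.txt and the scratch directory are made up for the test.

```shell
tmp=$(mktemp -d)            # scratch directory so nothing real is overwritten
cd "$tmp" || exit 1

echo "stale content" > demo.txt

# Both prints are redirected to the same file within one awk run
awk 'BEGIN { print "first" > "demo.txt"; print "second" > "demo.txt" }'

cat demo.txt
```

The old content is gone, but both printed lines survive: the file is emptied once when awk first opens it, and everything written afterwards in that run accumulates.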
 
Old 03-21-2012, 11:14 AM   #7
mikes88
Original Poster
It isn't working... It just outputs to the screen and not to a file.
 
Old 03-21-2012, 11:19 AM   #8
druuna
Hi,

I just checked colucix's code and it works.
 
Old 03-21-2012, 11:19 AM   #9
colucix
Quote:
Originally Posted by mikes88 View Post
Isnt working... Just outputs to the screen and not a file.
Please, can you post an example of a real header line?
 
Old 03-21-2012, 11:21 AM   #10
druuna
Hi,

Are you sure the first line is exactly as given in post #1? Especially the spaces used.

Here's an alternative to tackling the separator issue:
Code:
#!/bin/bash

awk -F"/" '
/info/ { gsub(/[ ]+/,"",$0)    }
{
  if ( $4 != "" ) {
    OUTFILE = $4
  } else {
    print $0 > OUTFILE
  }
}' filename
Hope this helps.
 
Old 03-21-2012, 11:32 AM   #11
mikes88
Original Poster
The spaces are exact: / info info / info / servername / info / info / info / info
(sorry, I missed a word)

I'm getting the following error:

Code:
awk: cmd. line:6: (FILENAME=testing-split.csv FNR=1) fatal: expression for `>' redirection has null string value

Last edited by mikes88; 03-21-2012 at 11:33 AM.
 
Old 03-21-2012, 11:44 AM   #12
grail
This error tells us that the if branch was never entered before the first print, and hence the OUTFILE variable is still empty.
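One way to guard against that, sketched here with made-up input (sample.csv and srv1 are not from the real data), is to skip every line that arrives before the first header has set OUTFILE:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Hypothetical input: a data line appears before any header line
printf '%s\n' '1,2,3,' '/ a / b / srv1 / c' '4,5,6,' > sample.csv

awk -F" / " '
  $3 != ""      { OUTFILE = $3; next }   # header: remember the file name
  OUTFILE == "" { next }                 # no header seen yet: skip the line
                { print > OUTFILE }      # data line: write to the current file
' sample.csv

cat srv1
```

The stray first line is dropped rather than aborting the run; whether dropping it silently is acceptable depends on the data.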
 
Old 03-21-2012, 12:11 PM   #13
colucix
Please, post your current code and a REAL header line.
 
Old 03-21-2012, 12:38 PM   #14
mikes88
Original Poster
Quote:
Originally Posted by colucix View Post
Please, post your current code and a REAL header line.
Code:
awk -F"/" '
"/ Toronto MDS_GW2 /" { gsub(/[ ]+/,"",$0)    }
{
  if ( $3 != "" ) {
    OUTFILE = $3
  } else {
    print $0 > OUTFILE
  }
}' testing-split.csv
Header:

Code:
/ Toronto MDS_GW2 / 10_10_10_10_7036 / tsr2 / service(type=dist) / service / SELECTFEED / Update
 
Old 03-21-2012, 12:50 PM   #15
druuna
Hi,

Have a look at this
Code:
#!/bin/bash

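awk -F"/" '
# Header lines contain slashes: strip all spaces so $4 is the bare server name
/\// { gsub(/[ ]+/,"",$0) }
{
  if ( $4 != "" ) {
    OUTFILE = $4          # header line: remember the output file name
  } else {
    print $0 > OUTFILE    # data line: write it to the current server file
  }
}' testing-split.csv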
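I'm assuming that Toronto MDS_GW2 is not the only value that can appear in a header, so I used /\//, which matches any line containing a forward slash (the slash needs to be escaped because it delimits the regex in awk). I also believe $3 should be $4, since the line starts with the separator and field 1 is therefore empty.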
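For anyone who wants to try this end to end, here is a self-contained sketch. It is a variant, not the script above: it trims only the server name instead of stripping spaces from the whole line, closes each file before switching, and appends with ">>". The second header (tsr3 and its address) is made up to show the switch between files; the first header and the data rows come from posts #14 and #1.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Sample input: real header from post #14, data rows from post #1,
# plus a second, made-up header (tsr3) to show the switch between files
cat > testing-split.csv <<'EOF'
/ Toronto MDS_GW2 / 10_10_10_10_7036 / tsr2 / service(type=dist) / service / SELECTFEED / Update
Wed Feb 01 08:51:24 EST 2012,Wed Feb 01 08:51:44 EST 2012,1098.35,
Wed Feb 01 08:51:44 EST 2012,Wed Feb 01 08:52:04 EST 2012,915.25,
/ Toronto MDS_GW2 / 10_10_10_10_7037 / tsr3 / service(type=dist) / service / SELECTFEED / Update
Wed Feb 01 08:52:04 EST 2012,Wed Feb 01 08:52:24 EST 2012,937.7,
EOF

awk -F"/" '
/^\// {                              # header lines start with a slash
  if (OUTFILE != "") close(OUTFILE)  # release the previous file handle
  OUTFILE = $4
  gsub(/^ +| +$/, "", OUTFILE)       # trim the spaces around the server name
  next
}
OUTFILE != "" { print >> OUTFILE }   # append, so a repeated server keeps its earlier rows
' testing-split.csv

cat tsr3
```

Closing each file keeps the number of open handles small when there are many servers. The trade-off of ">>" is that rerunning the script in the same directory appends to the files from the previous run, so clear them first.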
 