LinuxQuestions.org
08-04-2015, 08:43 AM   #1
mierdatuti
Member
 
Registered: Aug 2008
Posts: 64

Rep: Reputation: 15
Help with awk


Hi,
I need to write an awk script. I have a text file in this format:
Code:
xxxddd
kk example
lsdjfs

sdkjlf
jfdsjlkf

kk example2 
fsdlkfj
----
sdfkjldsf

kk example3
djfjff
I would like to make three files. Each file must contain the content between two "kk" lines, like this:

file-> example
Code:
lsdjfs

sdkjlf
jfdsjlkf
file-> example2
Code:
fsdlkfj
----
sdfkjldsf
file-> example3
Code:
djfjff
I'm trying to parse each line with awk, but it doesn't work.
Code:
{
    if (keep == 0) {
        fnd = index($0, "kk");
        if (fnd) {
            #keep=1;
            if ($2) {
                print $2;
                var = $2;
                print "creating file " $2 >> $2
                keep = 1;
            }
        } else {
            keep = 0
        }
    }
    if (keep == 1) {
        getline sig;
        print "next line ....." sig;
        fnd = index(sig, "kk");
        if (fnd) {
            keep = 0;
        } else
            print "-------------------variable is ...." sig
        print sig >> var
    }
}
Could some guru help me, please?
Thanks
 
08-04-2015, 09:25 AM   #2
HMW
Member

Registered: Aug 2013
Location: Sweden
Distribution: Debian, Arch, Red Hat, CentOS
Posts: 773
Blog Entries: 3

Rep: Reputation: 369
Well, if you HAVE to do this with awk, I am sure one of the 'awk gurus' (I am not a member of that specific clan) can help you out. But! If it were me, I would check out csplit, which ought to be able to do what you want.

Can't try it out for you right now since I am <rant>stuck behind one of those wondrous machines designed in Cupertino that ships with a version of csplit that lacks half of the functionality one would EXPECT!!!</rant>

Anyway, check out this thread:
http://www.linuxquestions.org/questi...es-4175546320/

What happens if you try this command:
Code:
csplit --suppress-matched infile.txt '/^kk/' {*}
Does it do the trick?

Best regards,
HMW
 
08-04-2015, 09:36 AM   #3
mierdatuti
Member
 
Registered: Aug 2008
Posts: 64

Original Poster
Rep: Reputation: 15
Thanks. I would like to do this with awk; meanwhile, I tried your method:
Code:
csplit: unrecognized option '--suppress-matched'
Try `csplit --help' for more information.
[sp80439@oc6566503017 kk]$ csplit --help
Usage: csplit [OPTION]... FILE PATTERN...
Output pieces of FILE separated by PATTERN(s) to files `xx00', `xx01', ...,
and output byte counts of each piece to standard output.

Mandatory arguments to long options are mandatory for short options too.
  -b, --suffix-format=FORMAT  use sprintf FORMAT instead of %02d
  -f, --prefix=PREFIX        use PREFIX instead of `xx'
  -k, --keep-files           do not remove output files on errors
  -n, --digits=DIGITS        use specified number of digits instead of 2
  -s, --quiet, --silent      do not print counts of output file sizes
  -z, --elide-empty-files    remove empty output files
      --help     display this help and exit
      --version  output version information and exit

Read standard input if FILE is -.  Each PATTERN may be:

  INTEGER            copy up to but not including specified line number
  /REGEXP/[OFFSET]   copy up to but not including a matching line
  %REGEXP%[OFFSET]   skip to, but not including a matching line
  {INTEGER}          repeat the previous pattern specified number of times
  {*}                repeat the previous pattern as many times as possible

A line OFFSET is a required `+' or `-' followed by a positive integer.
 
08-04-2015, 09:47 AM   #4
grail
LQ Guru

Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,006

Rep: Reputation: 3191
Code:
awk 'NR>1{print > file[2]}{split(RT,file)}' ORS="" RS="kk[^\n]*\n" file
 
2 members found this post helpful.
08-04-2015, 11:09 AM   #5
HMW
Member

Registered: Aug 2013
Location: Sweden
Distribution: Debian, Arch, Red Hat, CentOS
Posts: 773
Blog Entries: 3

Rep: Reputation: 369
Quote:
Originally Posted by mierdatuti
Thanks. I would like to do this with awk; meanwhile, I tried your method:
Code:
csplit: unrecognized option '--suppress-matched'
Try `csplit --help' for more information.
Strange, it works for me. My version of csplit (using Debian 8.1):
Code:
$ csplit --version
csplit (GNU coreutils) 8.23
Copyright © 2014 Free Software Foundation, Inc.
But, my approach with csplit:
Code:
csplit --suppress-matched lqcsplit.txt '/kk/' {*}
Produces four files:
Code:
$ cat xx00
xxxddd
Code:
$ cat xx01
lsdjfs

sdkjlf
jfdsjlkf
Code:
$ cat xx02
fsdlkfj
----
sdfkjldsf
Code:
$ cat xx03
djfjff
So it's close, but no cigar. Check out grail's awk instead.
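For what it's worth, on a csplit that has no --suppress-matched (like the one whose --help output was posted earlier in the thread), the "kk NAME" line stays as the first line of each piece, and that line can be used to name the files. Below is a rough sketch only. It assumes GNU csplit (for {*} and -z), and the directory name csplit_demo, the file name infile.txt and the prefix piece. are made up for the example.

```shell
# Build the sample input from post #1 in a scratch directory (hypothetical name).
mkdir -p csplit_demo
printf '%s\n' 'xxxddd' 'kk example' 'lsdjfs' '' 'sdkjlf' 'jfdsjlkf' '' \
    'kk example2 ' 'fsdlkfj' '----' 'sdfkjldsf' '' 'kk example3' 'djfjff' \
    > csplit_demo/infile.txt

if command -v csplit >/dev/null 2>&1; then
    (
        cd csplit_demo || exit 1
        # Split before every line starting with "kk "; each piece after the
        # first one therefore begins with its own "kk NAME" line.
        csplit -s -z -f piece. infile.txt '/^kk /' '{*}'
        for piece in piece.*; do
            first=$(head -n 1 "$piece")
            case $first in
                "kk "*)
                    name=${first#kk }     # drop the leading "kk "
                    name=${name%% *}      # keep only the first word
                    # Write everything after the marker line to the named file.
                    [ -n "$name" ] && tail -n +2 "$piece" > "$name"
                    ;;
            esac
            rm -f "$piece"                # pieces without a marker are discarded
        done
    )
else
    echo "csplit not found" >&2
fi
```

Afterwards csplit_demo should hold example, example2 and example3, each with the lines between two markers (including the blank line that precedes the next marker).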

Best regards,
HMW

Last edited by HMW; 08-04-2015 at 11:11 AM.
 
08-04-2015, 07:29 PM   #6
syg00
LQ Veteran

Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,125

Rep: Reputation: 4120
One of the hard things I found with learning awk (and perl) is to not try and code like you would for a "traditional" language.
awk offers a lot of facilities that help (and hide) with the mundane work that has to be done. You would do well to examine grail's post - I imagine understanding it will be difficult without having the documentation handy.

Nice use of RT there grail.
 
08-04-2015, 08:47 PM   #7
astrogeek
Moderator

Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,263
Blog Entries: 24

Rep: Reputation: 4194
Quote:
Originally Posted by grail
Code:
awk 'NR>1{print > file[2]}{split(RT,file)}' ORS="" RS="kk[^\n]*\n" file
I stand in awe...

Quote:
Originally Posted by syg00
One of the hard things I found with learning awk (and perl) is to not try and code like you would for a "traditional" language.
awk offers a lot of facilities that help (and hide) with the mundane work that has to be done. You would do well to examine grail's post - I imagine understanding it will be difficult without having the documentation handy.

Nice use of RT there grail.
I have the O'Reilly sed & awk 2nd ed in hand and still had to break it down before understanding it. This is the first actual use of RT that I recall seeing and had to refer to it (page 266) twice before it sank in! Very nice - I'll "steal" (i.e. learn from) this one - thanks!

Note: RT appears to be a gawk-ism so may not work on other awks.
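For anyone else breaking it down, here is my reading of grail's one-liner spread over several lines with comments. Treat it as a sketch rather than grail's own explanation. It assumes GNU awk is installed as gawk, and the names rt_demo, infile.txt, name and parts are made up for the example (grail's version reuses the array file[2] directly).

```shell
# Build the sample input from post #1 in a scratch directory (hypothetical name).
mkdir -p rt_demo
printf '%s\n' 'xxxddd' 'kk example' 'lsdjfs' '' 'sdkjlf' 'jfdsjlkf' '' \
    'kk example2 ' 'fsdlkfj' '----' 'sdfkjldsf' '' 'kk example3' 'djfjff' \
    > rt_demo/infile.txt

if command -v gawk >/dev/null 2>&1; then
    (
        cd rt_demo || exit 1
        gawk '
            BEGIN {
                RS  = "kk[^\n]*\n"  # regex record separator: "kk" up to end of line
                ORS = ""            # records already carry their own newlines
            }
            # From the second record on, write the record to the file that was
            # named by the separator seen just before it.
            NR > 1 { print > name }
            # RT holds the text that matched RS after this record, for example
            # "kk example\n"; its second word names the file for the NEXT record.
            { split(RT, parts); name = parts[2] }
        ' infile.txt
    )
else
    echo "gawk not found; RT is not available in other awks" >&2
fi
```

The first record (everything before the first "kk" line) is never printed, which is what NR>1 is for. One thing to keep in mind is that this RS matches "kk" anywhere in a line, not only at the start, so data lines containing "kk" would also be treated as separators.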

Last edited by astrogeek; 08-04-2015 at 08:54 PM.
 
08-04-2015, 09:44 PM   #8
syg00
LQ Veteran

Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,125

Rep: Reputation: 4120
I have books on everything - except awk. I didn't see the need initially, as I thought it was something I wouldn't get to use much.
Wrong.
I make do with the manual - people like grail keep teaching me new things all the time.
 
08-04-2015, 11:56 PM   #9
astrogeek
Moderator

Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,263
Blog Entries: 24

Rep: Reputation: 4194
Yeah, I always buy the books...

For some reason awk has never stuck, in the sense that I know when to use it but I rarely see the intuitive path to a good awk solution quickly. As a result, I end up doing many things with sed and shell scripts that would be better done with awk. I then see an effective awk one-liner from someone like grail and cry...

I have recently been following and now participating in awk problem threads to try to remedy that.

Last edited by astrogeek; 08-05-2015 at 01:49 AM.
 
08-05-2015, 01:42 AM   #10
AnanthaP
Member

Registered: Jul 2004
Location: Chennai, India
Posts: 952

Rep: Reputation: 217
If line begins with kk, the destination file name is in $2 (second word). All other lines ($0) get redirected to this file.
So
Code:
BEGIN {
 redirFile=stdout ;
}
{
 if(substr($1,1,2)=="kk" redirFile=$2 ;
 else print $0>>redirFile
}
OK

Last edited by AnanthaP; 08-05-2015 at 01:45 AM.
 
08-05-2015, 02:23 AM   #11
astrogeek
Moderator

Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,263
Blog Entries: 24

Rep: Reputation: 4194
Quote:
Originally Posted by AnanthaP
If line begins with kk, the destination file name is in $2 (second word). All other lines ($0) get redirected to this file.
So
Code:
BEGIN {
 redirFile=stdout ;
}
{
 if(substr($1,1,2)=="kk" redirFile=$2 ;
 else print $0>>redirFile
}
Starting premise looks OK, but there are problems if you test it...

Not "all other lines get redirected", only those between "kk" lines. The leading lines cause a problem with the redirect.

You are missing a closing parenthesis on the if(... clause. It is probably a typo, but it should have been tested.

The definition redirFile=stdout does not work for the redirect: stdout is just an uninitialized awk variable, so redirFile ends up as an empty string and the redirect fails on the leading lines before the first "kk" line.

If you fix the parenthesis and avoid the null redirect, it works. But because it uses >>, running it twice appends to the existing files instead of rewriting them from the input, so pre-existing output files have to be removed first as an extra step, which can be confusing!

Nice try, see if you can tweak it up from here!
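To make those points concrete, here is one possible way the line-by-line idea could be tweaked. It is a sketch, not the only fix: it keeps the redirFile name and the substr() test from the post above, uses only plain awk features (no RT, so it should not need gawk), and the directory awk_demo and file infile.txt are made up for the example.

```shell
# Build the sample input from post #1 in a scratch directory (hypothetical name).
mkdir -p awk_demo
printf '%s\n' 'xxxddd' 'kk example' 'lsdjfs' '' 'sdkjlf' 'jfdsjlkf' '' \
    'kk example2 ' 'fsdlkfj' '----' 'sdfkjldsf' '' 'kk example3' 'djfjff' \
    > awk_demo/infile.txt

(
    cd awk_demo || exit 1
    awk '
        # A line whose first word starts with "kk" names the next output file.
        substr($1, 1, 2) == "kk" {
            if (redirFile != "") close(redirFile)   # finished with the previous file
            redirFile = $2
            next
        }
        # Lines before the first "kk" line have no target yet, so skip them
        # instead of redirecting to an empty file name.
        redirFile != "" { print $0 > redirFile }    # ">" truncates on first open
    ' infile.txt
)
```

Because it uses > instead of >>, each output file is rewritten on every run, so nothing has to be deleted beforehand. The catch is that if the same name appears after two different "kk" lines, the second block replaces the first, since the file is reopened after close().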

Last edited by astrogeek; 08-05-2015 at 02:50 AM. Reason: Fixed my confusion and typos
 
08-05-2015, 05:16 AM   #12
grail
LQ Guru

Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,006

Rep: Reputation: 3191
Quote:
Originally Posted by astrogeek
I have recently been following and now participating in awk problem threads to try to remedy that.
That and the online manual was how I got to know awk. I had never used it prior to seeing it in some posts by ghostdog when I first joined LQ.
 
08-05-2015, 05:51 AM   #13
AnanthaP
Member

Registered: Jul 2004
Location: Chennai, India
Posts: 952

Rep: Reputation: 217
Hi astrogeek

In post #11
  1. Yes, all other lines get redirected.
  2. No starting parenthesis and hence no closing parenthesis in the if. (A single statement doesn't require curly braces so long as it ends with a semicolon.)
  3. Double redirection (>>) is not really required in awk. The first use of > zaps the destination file. (By habit I wrote >>.)
  4. Yet to test it out. Feel free to improve on it.
OK
 
08-05-2015, 07:37 AM   #14
grail
LQ Guru

Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,006

Rep: Reputation: 3191
Actually, point 2 is correct; however, I believe it is the missing round bracket on your if that astrogeek was referring to.

Also, once bracket is in, testing yields:
Code:
$ ./anathap.awk op_data
awk: ./anathap.awk:8: (FILENAME=op_data FNR=1) fatal: expression for `>>' redirection has null string value
 
08-05-2015, 09:33 AM   #15
danielbmartin
Senior Member

Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by grail
Code:
awk 'NR>1{print > file[2]}{split(RT,file)}' ORS="" RS="kk[^\n]*\n" file
I'm in the ditch.

I used this ...
Code:
   Path=${0%%.*}
 InFile=$Path"inp.txt"

echo; echo; echo "Method of LQ Guru grail."
# awk               'NR>1{print > file[2]}{split(RT,file)}' ORS="" RS="kk[^\n]*\n" file
awk -v file=$InFile 'NR>1{print > file[2]}{split(RT,file)}' ORS="" RS="kk[^\n]*\n" $InFile
... and got this result ...
Code:
Method of LQ Guru grail.
awk: NR>1{print > file[2]}{split(RT,file)}
awk:                    ^ use of non-array as array
Please advise.

Daniel B. Martin
 
  

