LinuxQuestions.org
Old 09-25-2012, 01:50 AM   #16
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235

Trd300: Do you want to read the file as you send data to it? Perhaps you need to use a pipe instead. (See man mkfifo).
 
Old 09-25-2012, 02:15 AM   #17
AnanthaP
Member
 
Registered: Jul 2004
Location: Chennai, India
Posts: 952

Rep: Reputation: 217
Not tried it out yet, but I might read and process output1.txt on the END pattern. (The basic call remaining as
Quote:
gawk -f myprog.awk input1.txt
.)

Also, I would generate records in output1.txt iff (NR%2 == 0) and in this case generate as many lines as the length($3). Note that this would handle the missing last line in your output.
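A minimal sketch of the END idea, not tested against the real data: the main rule writes the intermediate file, and END closes it and reads it back line by line. The sample input, the `mid` variable and the "times 10" step are made up for illustration; only the output1.txt name comes from this thread.

```sh
#!/bin/sh
# Sketch: write an intermediate file in the main rule, read it back in END.
# File contents and the processing steps are invented for illustration.
tmp=$(mktemp -d)
printf 'a 1\nb 2\nc 3\n' > "$tmp/input1.txt"

result=$(awk -v mid="$tmp/output1.txt" '
    # Pass 1: every input record produces one line in the intermediate file.
    { print $1, $2 * 10 > mid }

    END {
        close(mid)                           # flush and close before reading back
        while ((getline line < mid) > 0) {   # pass 2 runs over the intermediate file
            split(line, f, " ")
            if (f[2] + 0 > 10) print f[1]    # keep names whose value exceeds 10
        }
        close(mid)
    }
' "$tmp/input1.txt")

echo "$result"
rm -rf "$tmp"
```

The close() before the read loop matters: without it the file may still be buffered and open for writing when getline tries to read it.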

OK
 
Old 09-25-2012, 03:48 AM   #18
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
Quote:
Do you want to read the file as you send data to it?
konsolebox: the goal would be to convert step-wise algorithms (e.g. "1.awk" and "2.awk") into one single script "myprog.awk".

Single script "myprog.awk":
Code:
original input ---> myprog.awk ---> final output
Step-wise algorithm:
Code:
original input ---> 1.awk ---> output1 ---> 2.awk ---> final output

(output1 being the file that I am trying to redirect inside myprog.awk)
Do you see my point?
Do I really need to use named pipe to do that?

AnanthaP: I tried konsolebox's 2nd strategy (using the END section), but it didn't change anything.
 
Old 09-25-2012, 03:59 AM   #19
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
@Trd300: If you want to run 1.awk and 2.awk simultaneously, you would need a pipe or some other runtime medium for that. Using an ordinary file (in this case, that output1 file) won't work; I mean, having both awk scripts open the file at the same time, one for output and one for input, won't. It would work if you run 1.awk first and then 2.awk afterwards. I'm not sure if there are techniques to do that on an ordinary file, but generally I know of none, or perhaps it would be hard in awk. Maybe in C, with the file opened as binary.

Anyway, perhaps you're just giving that as an example to show what you really want (myprog.awk), so how about grail's suggestion of using arrays? Do you still need the in-between output1 file for later operations?
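For reference, a rough sketch of the "simultaneous" case with an anonymous pipe and with a named pipe created by mkfifo. The two one-line awk programs are stand-ins, since the real 1.awk and 2.awk are not shown in this thread, and the temporary directory is only there to keep the example self-contained.

```sh
#!/bin/sh
# Two stand-in stages; they only play the roles of 1.awk and 2.awk.
stage1='{ print $1 * 2 }'
stage2='$1 > 2 { print "kept " $1 }'

# Anonymous pipe: both awk processes run at the same time.
piped=$(printf '1\n2\n3\n' | awk "$stage1" | awk "$stage2")

# Named pipe: same data flow, but the channel has a name in the file system.
tmp=$(mktemp -d)
mkfifo "$tmp/output1"
printf '1\n2\n3\n' | awk "$stage1" > "$tmp/output1" &   # writer in the background
fifo=$(awk "$stage2" "$tmp/output1")                    # reader in the foreground
wait
rm -rf "$tmp"

echo "$piped"
echo "$fifo"
```

Both variants should print the same lines; the named pipe is only worth the extra setup when the two stages cannot be joined on one command line.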

Last edited by konsolebox; 09-25-2012 at 04:01 AM.
 
Old 09-25-2012, 04:34 AM   #20
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
Actually the step-wise algorithm looks more like that:
Code:
original input ---> 1.awk ---> output1
        output1 ---> 2.awk ---> final output


OK, I see what you mean.

I am not sure arrays will fix the problem. In myprog.awk a first function1 will produce results1, and a different function2 will produce results2. As I cannot assign different values (results1 & results2) to the same variable, I thought that redirecting results1 and results2 to the same file, concatenating them, would sort the problem out:

Code:
BEGIN{}

<define function1 here>

<define function2 here>


{print function1($X) > "output1.txt"}

{print function2($X) >> "output1.txt"}

{    close("output1.txt");
     RS=ORS="\n";  
     while((getline < "output1.txt") > 0){<keep working on "output1.txt" as an input>}
}

END{}
I am gonna try using getline from a coprocess, although I don't know if there is a way to concatenate the results of the different functions.
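On the concatenation question, here is a small sketch of how one array can hold the results of both functions without any file. The bodies of function1 and function2 are hypothetical placeholders (the real ones are not given above), and `results` and `n` are names chosen for this example.

```sh
#!/bin/sh
result=$(printf 'abc\nde\n' | awk '
    # Hypothetical stand-ins for function1 and function2.
    function function1(s) { return toupper(s) }
    function function2(s) { return s s }

    # Append both results to one array; n counts the stored lines.
    {
        results[++n] = function1($0)
        results[++n] = function2($0)
    }

    # Second stage: walk the array in insertion order.
    END {
        for (k = 1; k <= n; k++)
            if (length(results[k]) > 3) print results[k]
    }
')

echo "$result"
```

The counter index keeps the lines in the order they were produced, which is what a concatenated file would give as well.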
 
Old 09-25-2012, 04:51 AM   #21
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,006

Rep: Reputation: 3191
I must be reading this all wrong as I am getting very lost now

If we assume the 1.awk / 2.awk approach, my understanding is that you would create a temporary file after running 1.awk on the original input and then run 2.awk on the temporary
file to produce the final output. Is this correct?

If above is correct, is it not simply a case of first performing the necessary tasks on the original data and then any follow up tasks to produce the desired output?

Again, I would request a before and after picture of the data. It seems to me you may be trying to fit a square peg in a round hole, when this is not necessarily the process you should be using.
 
Old 09-25-2012, 05:47 AM   #22
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
Quote:
If we assume the 1.awk / 2.awk approach, my understanding is that you would create a temporary file after running 1.awk on the original input and then run 2.awk on the temporary
file to produce the final output. Is this correct?
Yes it is correct.

Quote:
If above is correct, is it not simply a case of first performing the necessary tasks on the original data and then any follow up tasks to produce the desired output?
Yes it is a case like that.


Code:
original input ---> function1---> results1
                                           ----> concatenate results1 & 2 ---> process ---> final output
               ---> function2---> results2

Last edited by Trd300; 09-25-2012 at 05:53 AM.
 
Old 09-25-2012, 06:07 AM   #23
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Does that mean the input is read twice by the two functions (one whole file at a time), or twice per line? How about the concatenated output as well?
 
Old 09-25-2012, 03:55 PM   #24
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,187

Rep: Reputation: 354
If you're using a version of gawk that supports it (Version 4 does; I'm not sure about version 3), you could consider something like this:
Code:
BEGIN {
# Expand the argument list so each input file name is duplicated:
  for (i=1; i<ARGC; ++i) {
# Is this a valid (readable) file?
    if ((getline test < ARGV[i]) > 0) {
      close(ARGV[i])
      for (j=ARGC;j>i;--j) {
        ARGV[j]=ARGV[j-1]
      }
      ++ARGC
      ++i # So the outer loop skips the duplicate we've added . . .
    }
   }
   process_count=0
}
BEGINFILE {
# Is this a readable file?
  if (ERRNO != "") {
#   Process the non-file value.
    nextfile
  }
  ++process_count
}
process_count==1 {
# Do the stuff for the first pass through the file . . .
}
process_count==2 {
# Do your thing for the second pass through the file . . .
}
ENDFILE {
  if (process_count==2) {
    process_count=0
#   Any other EOF processing you want . . .
  }
}
END {
# Final clean-up and termination processing . . .
}
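If BEGINFILE is not available, a common portable way to get two passes is to name the file twice on the command line and test NR==FNR, which holds only while the first copy is being read. This is a different technique from the ARGV duplication above, shown as a sketch; the data file and the "above the mean" task are invented for the example.

```sh
#!/bin/sh
tmp=$(mktemp -d)
printf '3\n5\n4\n' > "$tmp/data.txt"

result=$(awk '
    NR == FNR { sum += $1; count++; next }   # first pass: collect totals
    $1 > sum / count { print $1 }            # second pass: values above the mean
' "$tmp/data.txt" "$tmp/data.txt")

echo "$result"
rm -rf "$tmp"
```

The NR==FNR test assumes the first file is not empty; with an empty first file the second copy would be mistaken for the first pass.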
 
Old 09-25-2012, 04:36 PM   #25
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Actually, if it's on a per-line basis, Trd300 could just take the variable ($0 or another) that stores the input and pass it to the two functions. If it's on a per-file basis, he/she could read the file twice with:
Code:
# getline returns -1 on error, so test for > 0 to avoid an endless loop
while ((getline < input) > 0) {
    # ...
}

close(input)

while ((getline < input) > 0) {
    # ...
}

close(input)
The latter builds on my earlier suggestion of using only the BEGIN block.
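A complete, runnable sketch of that BEGIN-only approach, assuming the file name is passed in with -v (the `input` variable and the sample file are made up here): the first loop counts the lines, the second one uses the count.

```sh
#!/bin/sh
tmp=$(mktemp -d)
printf 'x\ny\nz\n' > "$tmp/data.txt"

result=$(awk -v input="$tmp/data.txt" '
BEGIN {
    # First pass: count the lines.
    while ((getline line < input) > 0) n++
    close(input)                 # rewind by closing; the next getline reopens

    # Second pass: number each line against the total.
    while ((getline line < input) > 0) {
        k++
        print k "/" n ": " line
    }
    close(input)
}')

echo "$result"
rm -rf "$tmp"
```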
 
Old 09-25-2012, 08:19 PM   #26
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
Here is an example I've seen on the web.

input:
Code:
@XXXXXX|YYY
12345678
...
First, writing the numbers on the same line as the preceding record separated by a pipe (and remove the "@"):
Code:
XXXXXX|YYY|12345678
...
To do that, set the RS as "@" and delete the "\n".

Then we use 2 functions:
function1: convert block of 2 numbers to letters (according to a conversion array)
function2: reverse the string of numbers

1) From the original input file , using function1, convert block of 2 numbers to letters starting from the 1st letter, then the 2nd, then the 3rd,...until the end of the string.
2) Then always with the same input, using function2, reverse the original string of numbers and do like 1) to it.
3) concatenate the results of 1) with the results of 2) in the same output (in which we removed $2), to get this intermediate file:
Code:
XXXXXX|aceg      # start from 1st number (i.e. 12345678)
XXXXXX|bdfx      # start from 2nd number (i.e. 2345678)
XXXXXX|ceg       # start from 3rd number (i.e. 345678)
XXXXXX|dfx        # start from 4th number (i.e. 45678)
XXXXXX|eg         # start from 5th number (i.e. 5678)
XXXXXX|fx          # start from 6th number (i.e. 678)
XXXXXX|g           # start from 7th number (i.e. 78)
XXXXXX|x           # start from last number (i.e. 8)
XXXXXX|hjln       # same but after reversing the string starting from 1st number (i.e. 87654321)
XXXXXX|ikmx      # same but after reversing the string starting from 2nd number (i.e. 7654321)
etc...
4) Keep processing the intermediate file (e.g. keep the strings with more than 2 letters, or with a specific letter,...)



Here is how I tried to do:

Code:
BEGIN{
         RS="@"; FS=OFS="|"; conv["12"]="a"; conv["23"]="b"; conv["34"]="c"; conv["45"]="d"; conv["56"]="e"; conv["67"]="f"; conv["78"]="g";
         conv["87"]="h"; conv["76"]="i"; conv["65"]="j"; conv["54"]="k"; conv["43"]="l"; conv["32"]="m"; conv["21"]="n"
         }

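# extra parameters after the gap are local variables, so the functions
# do not overwrite globals such as the loop counter i in the main block
function convert(field, start,    letter, block){
         letter = ""
         block = substr (field, start, 2)
         while (block != ""){
              letter = letter (block in conv ? conv[block] : "x")
              start = start + 2
              block = substr (field, start, 2)
         }
         return letter
}

function rev(field,    rever, l, i){
         rever = ""
         l = length(field)
         for (i=l; 0<i; i--){
              rever = rever substr (field, i, 1)
         }
         return rever
}      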



NR==1{next}

NR>1{
          sub("\n", "|")       # write second line next to the preceding one
          gsub("\n", "")
         }

{
     for(i=1; i<=length($3); i++){                                            
          print $1 FS convert($3, i) > "intermediate.txt"    # step 1) and output in a file (we removed $2)
     }
     
     for(i=1; i<=length($3); i++){
          print $1 FS convert(rev($3), i) >> "intermediate.txt"    # step 2) (we removed $2) and 3) concatenate in the same file
     }
}

##### BLOCK BELOW DOESN'T WORK ######

{
     close("intermediate.txt");
     RS=ORS="\n"; FS=OFS="|";                 # re-define RS, FS to be able to use "intermediate.txt" as if it was the input of a second command-line
     while((getline < "intermediate.txt") > 0){
           if(length($2) > 2) {print $0}          # note that previous $3 in original input becomes $2 in "intermediate.txt"
           else{next}
  
           # ... <keep processing "intermediate.txt">
     }
}

Last edited by Trd300; 09-25-2012 at 09:13 PM.
 
Old 09-25-2012, 08:30 PM   #27
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Code:
{
     for(i=1; i<=length($3); i++){                                            
          print $1 FS convert($3, i) > "intermediate.txt"    # step 1) and output in a file (we removed $2)
     }
     
     for(i=1; i<=length($3); i++){
          print $1 FS convert(rev($3), i) >> "intermediate.txt"    # step 2) (we removed $2) and 3) concatenate in the same file
     }
}
For that, I think you should use >> for the first step as well and truncate intermediate.txt in the BEGIN block instead. But that is only needed if the current version doesn't work, that is, if the file gets truncated again each time the first step is reached.
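A quick way to check how the two redirections behave, as a sketch with made-up file names: in awk, `>` truncates the file only when it is first opened within a run and later prints to the same name append, while `>>` never truncates, so its output also survives from one run to the next.

```sh
#!/bin/sh
tmp=$(mktemp -d)

# Run each one-liner twice on the same two input lines.
for run in 1 2; do
    printf 'a\nb\n' | awk -v out="$tmp/trunc.txt"  '{ print $0 > out }'
    printf 'a\nb\n' | awk -v out="$tmp/append.txt" '{ print $0 >> out }'
done

trunc_lines=$(wc -l < "$tmp/trunc.txt" | tr -d '[:space:]')
append_lines=$(wc -l < "$tmp/append.txt" | tr -d '[:space:]')

echo "with >:  $trunc_lines lines"    # both records of the last run, nothing older
echo "with >>: $append_lines lines"   # records of both runs
rm -rf "$tmp"
```

So with `>` in the first step the file should not be cut back on every record, and with `>>` everywhere an old intermediate.txt from a previous run has to be removed or emptied first.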
 
Old 09-25-2012, 08:40 PM   #28
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
When I delete the "while((getline ...)" block after redirecting the output to "intermediate.txt" for the second time, the file contains the correct data.

If I do the same with ">>" at the first redirection, the file contains the data in duplicate.

The last block is the issue !
 
Old 09-25-2012, 08:55 PM   #29
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Sorry. I have tried to go through the whole thread, but it's still not apparent what ~final~ output you really want to have. We could help better if we knew that. It's rather confusing to follow the procedures as described.

---- Add ----

I mean at least we need a real example output from original form to final.
 
Old 09-25-2012, 09:22 PM   #30
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
I understand it can be confusing.
My last post with the code explains pretty much everything. You don't need to look at anything before that post.

input:
Code:
@XXXXXX|YYY
12345678
"intermediate.txt":
Code:
##### Results from the first call of the function ######
XXXXXX|aceg      # start from 1st number (i.e. 12345678)
XXXXXX|bdfx      # start from 2nd number (i.e. 2345678)                                                               
XXXXXX|ceg       # start from 3rd number (i.e. 345678)
XXXXXX|dfx        # start from 4th number (i.e. 45678)
XXXXXX|eg         # start from 5th number (i.e. 5678)
XXXXXX|fx          # start from 6th number (i.e. 678)
XXXXXX|g           # start from 7th number (i.e. 78)
XXXXXX|x           # start from last number (i.e. 8)
###### Results from the second call of the function after reversing the string ######
XXXXXX|hjln       # same but after reversing the string starting from 1st number (i.e. 87654321)
XXXXXX|ikmx      # same but after reversing the string starting from 2nd number (i.e. 7654321)
etc...                   # same as previous line until the end of the reverse string
final output (if, in the last block of the code when I redirect "intermediate.txt" as the new input, I want to keep $2 > 2 letters long for instance):
Code:
XXXXXX|aceg      # start from 1st number (i.e. 12345678)
XXXXXX|bdfx      # start from 2nd number (i.e. 2345678)
XXXXXX|ceg       # start from 3rd number (i.e. 345678)
XXXXXX|dfx        # start from 4th number (i.e. 45678)
XXXXXX|hjln       # same but after reversing the string starting from 1st number (i.e. 87654321)
XXXXXX|ikmx      # same but after reversing the string starting from 2nd number (i.e. 7654321)
etc...
The problem is the transition between the block when I use the functions and concatenate both results and the block when I want to use "intermediate.txt" as a new input.
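One possible arrangement of that transition, offered as a sketch and not as the definitive fix: the second stage is moved into END, so it runs once after every record has been written, RS is reset there before the file is read back, and each line is read into a variable and split by hand so the main input settings are left alone. It assumes the second stage only has to start after the whole input has been processed; the `mid` variable, the temporary directory and the table-building loop in BEGIN are choices made for this example.

```sh
#!/bin/sh
tmp=$(mktemp -d)
printf '@XXXXXX|YYY\n12345678\n' > "$tmp/input.txt"

result=$(awk -v mid="$tmp/intermediate.txt" '
BEGIN {
    RS = "@"; FS = OFS = "|"
    # Same conversion table as in the thread, built from two lists.
    n = split("12 23 34 45 56 67 78 87 76 65 54 43 32 21", key, " ")
    split("a b c d e f g h i j k l m n", val, " ")
    for (k = 1; k <= n; k++) conv[key[k]] = val[k]
}

# Names after the gap in the parameter list are local variables.
function convert(field, start,    letter, block) {
    while ((block = substr(field, start, 2)) != "") {
        letter = letter (block in conv ? conv[block] : "x")
        start += 2
    }
    return letter
}

function rev(field,    out, i) {
    for (i = length(field); i > 0; i--) out = out substr(field, i, 1)
    return out
}

{
    sub(/\n/, "|"); gsub(/\n/, "")     # join the two lines of a record
    if ($0 == "") next                 # skip the empty record before the first @
    for (i = 1; i <= length($3); i++) print $1, convert($3, i) > mid
    r = rev($3)
    for (i = 1; i <= length(r); i++) print $1, convert(r, i) > mid
}

END {
    close(mid)
    RS = "\n"                          # the intermediate file is line based
    while ((getline line < mid) > 0) {
        split(line, f, "|")
        if (length(f[2]) > 2) print line
    }
    close(mid)
}
' "$tmp/input.txt")

echo "$result"
rm -rf "$tmp"
```

With the sample input above this should keep the aceg, bdfx, ceg and dfx lines, followed by hjln, ikmx, jln and kmx from the reversed string.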

Last edited by Trd300; 09-25-2012 at 09:24 PM.
 
  


All times are GMT -5.