[SOLVED] How to make AWK do *something* without giving it any stdin or input file? Can it be done?

GrapefruiTgirl · 10-10-2010, 12:54 PM

QUESTION 1:

Take for example this code:

Code:

df | awk '{print $1}'
Filesystem
/dev/root
tmpfs
/dev/hda12

So I've given the output from `df` on AWK's stdin. But what I wonder is if there's a way to get AWK to run `df` itself, produce the same output, and exit?
Doesn't seem to be that simple. Here are some examples:

Code:

# Works for some reason, and I think only in Bash (nothing is required in the $() :)

awk '{ while (("df" | getline result) > 0 ){ split(result,results); print results[1] }; exit  }' <<< $()

Code:

# Works, but has a useless echo and pipe:

echo | awk '{ while (("df" | getline result) > 0 ){ split(result,results); print results[1] }; exit  }'

Code:

# Works but requires a keypress before it will run, and outputs that keypress:

awk '{ while (("df" | getline result) > 0 ){ split(result,results); print results[1] }; exit  }'

Code:

# Still requires input & ENTER key, and prints that input with the output; also it prints the whole input record, not $1:

 awk '{ while ((getline line < system("df") > 0)) { print $1 }; exit}'
 awk '{ while ((getline result < system("df") > 0)) { split(result,results); print results[1] }; exit}'

Code:

# works great (again probably only Bash).. (wondering how this differs from a pipe into stdin?):

 awk '{ print $1 }' <<<"$(df)"

So basically, does AWK absolutely need *something* on stdin, before it begins to process the data? Can it be made to open a file or stream internally, act upon that as though it were the stdin, and exit?

This is more a curiosity than anything - I don't mind reading, so if it's not a simple "yes/no" and there's something to read on this, a link or two is just peachy!

Thanks.

mesiol · 10-10-2010, 01:04 PM

Hi,

did you check awk man page?

----- quote from awk man page ----------
gawk [ POSIX or GNU style options ] -f program-file [ -- ] file
----- end of quote ---------------------

As you can see, awk will work on files. There is no need for any kind of standard input to get awk to work.

GrapefruiTgirl · 10-10-2010, 01:08 PM

mesiol, thanks.

I'm familiar with the man page, yes, but have not (yet) found a way to make the -f <program-file> thing, and/or -- file thing, apply to this situation. That still requires an input file or program file. I don't want any files involved.

Maybe I didn't make my OP clear enough. I don't want to give anything on stdin, AND I don't want to give any external files.

Am I missing something?

J_Szucs · 10-10-2010, 02:32 PM

In order to spare you the ENTER key :
awk 'BEGIN { while ((getline line < system("df") > 0)) { print $1 }; exit}'

The merit is still yours for the code, I have just added the BEGIN keyword
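For readers trying this variant: as far as I can tell, it prints df's full lines rather than just $1 because system() does not hand the command's output to getline. It runs the command with its output going straight to stdout and returns the exit status, so the getline then tries to read from a file named after that status. The sketch below contrasts system() with the command-pipe form; the `system_vs_pipe` name and the `echo alpha beta` stand-in command are only illustrative.

```shell
# system_vs_pipe: hypothetical helper contrasting system() with "cmd" | getline
system_vs_pipe() {
    awk 'BEGIN {
        rc = system("echo alpha beta")        # echo writes straight to stdout
        print "system returned: " rc          # rc is the exit status, not the text
        if (("echo alpha beta" | getline) > 0)
            print "pipe gave $1: " $1         # the pipe form fills $0 and the fields
    }'
}

system_vs_pipe
```

With system() the text never passes through awk's field splitting, which is why $1 has nothing useful in it.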

grail · 10-10-2010, 08:01 PM

J has given the solution but thought I would just throw an alternative to think about as well

Code:

awk 'BEGIN{while("df" |& getline)print $1}'

ghostdog74 · 10-10-2010, 08:27 PM

Code:

 awk 'BEGIN{ while( ("df"|getline) >0 ){ print $1} }'

when you do a getline by itself, it's still $0

konsolebox · 10-10-2010, 09:49 PM

The "df" command string is used as a key in an internal hash, so when it's first opened, a new fd is associated with it. That means it's not really opened more than once in a loop unless you call close().

More explicitly, in a C-like style, it can be written like this:

Code:

awk '
    BEGIN {
        cmd = "df"

        while ((cmd | getline) > 0) {
            print $1
        }

        close(cmd)

        exit
    }
'

GrapefruiTgirl · 10-11-2010, 05:33 AM

Thanks to J_Szucs (post #4) for the idea to add BEGIN, that solved the problem of requiring input to make the program start. But my not-so-merit-worthy code in that post, printed the entire input record out in one shot, seemingly when it was all read in (and I suppose, when the file descriptor got closed), so my using $1 in that code didn't work -- maybe this is what ghostdog74 was saying in post #6?:

Code:

when you do a getline by itself, it's still $0

ghostdog74 can you expand on that a little?

Posts #5 and #6 (grail + ghostdog74) both illustrate exactly what I was trying to figure out; I dunno how I managed to overlook the relative simplicity of the commands during my experimentation! The only difference between post #5 and #6 is that post #5 uses the "co-process" pipe instead of the usual pipe. The man page says the |& coprocess operator is not valid in --posix mode, let's see:

Code:

sasha@reactor: awk --posix 'BEGIN{ while ("df"|&getline){ print $2} }'
awk: BEGIN{ while ("df"|&getline){ print $2} }
awk:                    ^ syntax error
sasha@reactor: awk --posix 'BEGIN{ while ("df"|getline){ print $2} }'
1K-blocks
15116836
2028924
21164916
sasha@reactor:

Right, so the regular pipe works in POSIX mode.

QUESTION: what are the real differences between | and |& in this command? (You don't have to answer, I'll look it up!)
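A short sketch on that question, assuming GNU awk: with the plain pipe, `cmd | getline` only reads the command's output, while `|&` opens a two-way pipe to a coprocess, so the same awk program can write to the command and then read its results back. For a command like df, which takes no input, both forms should give the same lines. The `coprocess_sort` name and the use of `sort` as the coprocess are just for illustration.

```shell
# Assumes GNU awk (gawk); |& is a gawk extension and is rejected by --posix.
coprocess_sort() {
    gawk 'BEGIN {
        cmd = "sort"
        print "pear"  |& cmd          # write to the coprocess stdin
        print "apple" |& cmd
        close(cmd, "to")              # send EOF so sort can finish
        while ((cmd |& getline line) > 0)
            print line                # read the coprocess stdout
        close(cmd)
    }'
}

# Only call it where gawk is installed.
if command -v gawk >/dev/null 2>&1; then
    coprocess_sort
fi
```

The close(cmd, "to") step matters here: sort cannot emit anything until it has seen the end of its input.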

konsolebox: your post shows an example of how/when the close() statement is used; I had been wondering why there is a close() but no open() command. But I still wonder, does it *need* to be used, particularly in the context of this thread? What if it's not used? Is there any difference anywhere, as far as memory use or anything else goes? And I don't understand the bit about 'hashes', either in this context or perhaps in any other. I slept in on the day they taught us about hashes.

My next post will demonstrate what I'm getting at with all this. Thanks for everyone's input thus far.

ghostdog74 · 10-11-2010, 05:40 AM

Quote:

Originally Posted by GrapefruiTgirl

ghostdog74 can you expand on that a little?

What I mean is, when you do the getline, you don't assign it to a variable:

Code:

while ( .. | getline )

So by default, the current line will be $0 (basically just like in Perl, by default the current line is $_ .)
From there, you can call your fields like normal. $1,$2 ....

GrapefruiTgirl · 10-11-2010, 05:53 AM

Ahh, yes thanks, I understand. As opposed to this:

Code:

 awk --posix 'BEGIN{ while ("df"|getline line){ split(line,parts); print parts[2]} }'

Where I assign each current line to the `line` variable. Now, parts[2] is the same as what $2 would normally be.
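The two getline forms discussed here can be put side by side in a small sketch. It uses `echo alpha beta` as a stand-in for df so the output is predictable; the `getline_forms` name is hypothetical.

```shell
# getline_forms: hypothetical helper comparing "cmd | getline" with "cmd | getline var"
getline_forms() {
    awk 'BEGIN {
        cmd = "echo alpha beta"

        # Plain getline: the record lands in $0, fields are split, NF is set.
        if ((cmd | getline) > 0)
            print "plain:", $2, "NF=" NF
        close(cmd)

        # getline var: only var is set; $0 and NF are left alone.
        $0 = ""
        if ((cmd | getline line) > 0) {
            split(line, parts)
            print "var:", parts[2], "NF=" NF
        }
        close(cmd)
    }'
}

getline_forms
# plain: beta NF=2
# var: beta NF=0
```

So the variable form needs the explicit split(), while the plain form gets $1, $2 and friends for free.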

And.. My next post will not yet demonstrate where I'm going with this.

GrapefruiTgirl · 10-11-2010, 06:36 AM

So despite that this code is working on the commandline, it doesn't work as a script.
I take that back - it *does* work as a script, but to do it in a loop, I need to close() the fd after each run. Otherwise, I suppose what happens is that the thing runs once, the fd remains open but there's no more data, and the program just sits there. Or, it just runs once and exits. This works anyhow:

Code:

#!/usr/bin/awk -f
# This runs only once and exits, unless the close() statement is uncommented, and then it works great:
BEGIN{
   cmd="df"
   for (x=0; x<= 1000; x++) {
       while (cmd|getline) { print $2 }
#  close(cmd)
   }
}
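The effect of close() in a loop like this can be checked with a command whose output is easy to count. This is only a sketch: `count_runs` and its `mode` argument are made-up names, and `echo hi` stands in for df.

```shell
# count_runs MODE: read "echo hi" through getline three times and count the lines seen
count_runs() {
    awk -v mode="$1" 'BEGIN {
        cmd = "echo hi"
        for (i = 0; i < 3; i++) {
            while ((cmd | getline line) > 0)
                n++
            if (mode == "close")
                close(cmd)    # forget the pipe so the next pass reruns cmd
        }
        print n
    }'
}

count_runs noclose   # prints 1
count_runs close     # prints 3
```

Without close(), awk keeps the finished pipe associated with the command string, so later passes hit end-of-file immediately instead of starting the command again.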

So, the whole idea here, is to see if using awk+df in this way, would be faster than using bash to pipe `df` into `awk`.. Some highly scientific tests follow:

Code:

#!/bin/bash
# time_bash.sh
for ((x=0; x<=10000; x++)); do
    df | awk '{ print $2 }'
done

real    0m22.774s
user    0m4.696s
sys     0m19.455s

Code:

#!/bin/dash
# time_dash.sh
for ((var=0; var<=10000; var++)); do
    df | awk '{ print $2 }'
done

# Wtf.. Doesn't run? It may be early but this is perplexing me:
./time_dash: 3: Syntax error: Bad for loop variable

# What do you mean, "bad for loop variable"? Pfft..

Code:

#!/bin/dash
# this works..
var=0
while [ $var -le 10000 ]; do
    df | awk '{ print $2 }'
    var=$((var+1))
done
real    0m21.367s
user    0m5.276s
sys     0m12.338s
sasha@reactor:

Code:

#!/usr/bin/awk -f
# time_awk.sh
BEGIN{
   cmd="df"
   for (x=0; x<=10000; x++) {
       while (cmd|getline) { print $2 }
       close(cmd)
   }
}

real    0m29.776s
user    0m5.963s
sys     0m16.284s

Results: I'm surprised! Using awk alone appears to consume more time than using either shell. I wonder why this is...

Well, the reason for this thread was "why didn't the command line of awk work, with no stdin nor files", and that has been solved thanks to those contributors above.

Thanks for this!

Even though I'll mark it solved, further input on anything in here will be welcome.

grail · 10-11-2010, 08:08 AM

Quote:

./time_dash: 3: Syntax error: Bad for loop variable

Does dash support this for-loop construct? I only know it from bash:

Code:

for ((var=0; var<=10000; var++)); do

What happens if you leave the close(cmd) out?
It may also pay to look at flushing after you close / don't close as well to see if that helps

Code:

fflush("df")

GrapefruiTgirl · 10-11-2010, 08:22 AM

Well, maybe I'm just having a really off sort of day.. Actually I don't feel great, maybe getting a cold, so if I'm missing dumb obvious things today, that's my excuse.

As for "does dash support that loop?", well as far as I knew until right now, it did. I've written most of my scripts to run in any shell, and I'd be 100% surprised if I don't have a single working loop in some of my scripts. But, I just created a simple dash script with a loop and an echo statement, and it failed to run the same way:

Code:

sasha@reactor: ./dash_loop
./dash_loop: 3: Syntax error: Bad for loop variable
sasha@reactor: cat dash_loop
#!/bin/dash

for ((x=0;x<=5;x++)); do
  echo hello
done

sasha@reactor:

With single brackets, double brackets, square brackets, and with no brackets at all, it won't run either.

I'm baffled. Here's the manpage for dash:

Quote:

The syntax of the while command is

while list
do list
done

The two lists are executed repeatedly while the exit status of the first list is zero. The until command is similar, but has the word until in
place of while, which causes it to repeat until the exit status of the first list is zero.

The syntax of the for command is

for variable [ in [ word ... ] ]
do list
done

So, no mention of the standard for-loop. Weird. I just can't believe it.

I'll check what the usage of fflush() does in my test programs above and update again in a bit.

GrapefruiTgirl · 10-11-2010, 08:31 AM

Haven't checked fflush() effects yet, but here's the simple workaround for lack of a for loop, using `seq` instead:

Code:

#!/bin/dash

for x in $(seq 1 100); do
  echo hello
done

Though I still cannot believe the familiar construct earlier doesn't work.
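For what it's worth, the `for ((...))` arithmetic loop is an extension found in bash and a few other shells rather than part of the POSIX sh grammar, which matches the dash manpage excerpt quoted earlier. A counting loop that avoids both that construct and the external `seq` might look like the sketch below; `count_to` is just an illustrative name.

```shell
# count_to N: counting loop using only POSIX sh features (while, test, $((...)))
count_to() {
    i=1
    while [ "$i" -le "$1" ]; do
        echo "pass $i"
        i=$((i + 1))
    done
}

count_to 3
```

This should run the same way under dash, bash, and other POSIX-style shells.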

konsolebox · 10-11-2010, 08:48 AM

@GrapefruiTgirl, @grail I think by default awk closes all file descriptors that are still open when it exits. The close() in my example is just a way of demonstrating how opening and closing of command pipes works in awk.

@GrapefruiTgirl I sometimes call associative arrays hashes, or vice versa. So with the internal hash in awk, each command string is associated with a single value, that is, a single fd.

Here's an example awk script that necessarily requires close():

http://www.linuxquestions.org/questi...0/#post4066219