LinuxQuestions.org
10-10-2010, 12:54 PM   #1
GrapefruiTgirl
Guru
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

How to make AWK do *something* without giving it any stdin nor inputfile? Can it be done?


QUESTION 1:

Take for example this code:
Code:
df | awk '{print $1}'
Filesystem
/dev/root
tmpfs
/dev/hda12
So I've given the output from `df` to AWK on its stdin. What I wonder is whether there's a way to get AWK to run `df` itself, produce the same output, and exit.
It doesn't seem to be that simple. Here are some examples:
Code:
# Works for some reason, and I think only in Bash (nothing is required in the $() :)

awk '{ while (("df" | getline result) > 0 ){ split(result,results); print results[1] }; exit  }' <<< $()
Code:
# Works, but has a useless echo and pipe:

echo | awk '{ while (("df" | getline result) > 0 ){ split(result,results); print results[1] }; exit  }'
Code:
# Works but requires a keypress before it will run, and outputs that keypress:

awk '{ while (("df" | getline result) > 0 ){ split(result,results); print results[1] }; exit  }'
Code:
# still require input & ENTER key, and prints that input with output; also it prints whole input record, not $1:

 awk '{ while ((getline line < system("df") > 0)) { print $1 }; exit}'
 awk '{ while ((getline result < system("df") > 0)) { split(result,results); print results[1] }; exit}'
Code:
# works great (again probably only Bash).. (wondering how this differs from a pipe into stdin?):

 awk '{ print $1 }' <<<"$(df)"
So basically, does AWK absolutely need *something* on stdin, before it begins to process the data? Can it be made to open a file or stream internally, act upon that as though it were the stdin, and exit?

This is more a curiosity than anything - I don't mind reading, so if it's not a simple "yes/no" and there's something to read on this, a link or two is just peachy!

Thanks.

Last edited by GrapefruiTgirl; 10-10-2010 at 01:45 PM.
 
10-10-2010, 01:04 PM   #2
mesiol
Member
Registered: Nov 2008
Location: Lower Saxony, Germany
Distribution: CentOS, RHEL, Solaris 10, AIX, HP-UX
Posts: 731
Hi,

Did you check the awk man page?

----- quote from awk man page ----------
gawk [ POSIX or GNU style options ] -f program-file [ -- ] file
----- end of quote ---------------------

As you can see, awk will work on files. There is no need for any kind of standard input to get awk to work.
 
10-10-2010, 01:08 PM   #3
GrapefruiTgirl
Guru
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594
Original Poster
mesiol, thanks.

I'm familiar with the man page, yes but have not (yet) found a way to make the -f <program-file> thing, and/or -- file thing, apply to this situation. That still requires an input file or program file. I don't want any files involved.

Maybe I didn't make my OP clear enough. I don't want to give anything on stdin, AND I don't want to give any external files.

Am I missing something?
 
10-10-2010, 02:32 PM   #4
J_Szucs
Senior Member
Registered: Nov 2001
Location: Budapest, Hungary
Distribution: SuSE 6.4-11.3, Dsl linux, FreeBSD 4.3-6.2, Mandrake 8.2, Redhat, UHU, Debian Etch
Posts: 1,126
In order to spare you the ENTER key:
awk 'BEGIN { while ((getline line < system("df") > 0)) { print $1 }; exit}'

The merit for the code is still yours; I have just added the BEGIN keyword.
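
The reason BEGIN helps: a BEGIN action runs before awk tries to read any input, and a program made up only of BEGIN actions exits without reading stdin at all. A minimal sketch of just that point follows. The function name begin_only and the message text are made up for illustration, and stdin is pointed at /dev/null only to show that nothing is being read from it.

```shell
# Hypothetical helper: run an awk program that consists only of a BEGIN block.
begin_only() {
    # No pattern-action rules, so awk never waits for input.
    awk 'BEGIN { print "ran without input"; exit }' < /dev/null
}

begin_only
```

Run from a terminal without the /dev/null redirection, this should also return immediately instead of waiting for a keypress.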

Last edited by J_Szucs; 10-10-2010 at 02:37 PM.
 
10-10-2010, 08:01 PM   #5
grail
Guru
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,577
J has given the solution, but I thought I would throw in an alternative to think about as well:
Code:
awk 'BEGIN{while("df" |& getline)print $1}'
 
10-10-2010, 08:27 PM   #6
ghostdog74
Senior Member
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5
Code:
 awk 'BEGIN{ while( ("df"|getline) >0 ){ print $1} }'
when you do a getline by itself, its still $0

Last edited by ghostdog74; 10-10-2010 at 08:31 PM.
 
10-10-2010, 09:49 PM   #7
konsolebox
Senior Member
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15
The "df" command string is used as a key in an internal hash: the first time it is used, a new fd is opened for it, so the command is not really opened more than once in a loop unless you call close().

More explicitly, in something closer to C style, it can be written like this:
Code:
awk '
    BEGIN {
        cmd = "df"

        # Compare with > 0 so an error return (-1) also ends the loop
        while ((cmd | getline) > 0) {
            print $1
        }

        close(cmd)

        exit
    }
'
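
To make the effect of close() concrete, here is a small sketch that uses echo in place of df so the output is predictable. The helper name close_demo and the label strings are illustrative only. It reads the same command string three times: once normally, once again without close(), and once more after close().

```shell
# Hypothetical helper showing what close() changes for a command pipe.
close_demo() {
    awk 'BEGIN {
        cmd = "echo once"
        while ((cmd | getline line) > 0) print "first pass: " line
        # Same command string, still open and already at EOF: reads nothing.
        while ((cmd | getline line) > 0) print "no close: " line
        close(cmd)
        # After close(), the same string starts the command afresh.
        while ((cmd | getline line) > 0) print "after close: " line
        close(cmd)
    }'
}

close_demo
```

Only the "first pass" and "after close" lines should appear; the middle loop finds the pipe already exhausted.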
 
10-11-2010, 05:33 AM   #8
GrapefruiTgirl
Guru
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594
Original Poster
Thanks to J_Szucs (post #4) for the idea to add BEGIN; that solved the problem of requiring input to make the program start. But my not-so-merit-worthy code in that post printed the entire df output in one shot, seemingly once it had all been read in (and, I suppose, when the file descriptor got closed), so my use of $1 in that code didn't work. Maybe this is what ghostdog74 was saying in post #6?:
Code:
when you do a getline by itself, its still $0
ghostdog74 can you expand on that a little?
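
A likely explanation for the "whole output in one shot" behaviour of the post #4 code: system() does not hand the command's output back to awk. It runs the command, lets it write straight to stdout, and returns the exit status, so getline never gets the df lines to split into fields. A small sketch, using echo in place of df so the output is predictable (the helper name system_demo is illustrative only):

```shell
# Hypothetical helper: show what system() prints and what it returns.
system_demo() {
    awk 'BEGIN {
        # system() runs the command; its output goes directly to stdout.
        rc = system("echo hello")
        # What comes back is the exit status, not the text "hello".
        print "system() returned: " rc
    }'
}

system_demo
```

The word hello is printed by echo itself, and awk only ever sees the number 0.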

Posts #5 and #6 (grail + ghostdog74) both illustrate exactly what I was trying to figure out; I dunno how I managed to overlook the relative simplicity of the commands during my experimentation! The only difference between post #5 and post #6 is that post #5 uses the "co-process" pipe (|&) instead of the usual pipe (|). The man page says the |& coprocess operator is not valid in --posix mode. Let's see:
Code:
sasha@reactor: awk --posix 'BEGIN{ while ("df"|&getline){ print $2} }'
awk: BEGIN{ while ("df"|&getline){ print $2} }
awk:                    ^ syntax error
sasha@reactor: awk --posix 'BEGIN{ while ("df"|getline){ print $2} }'
1K-blocks
15116836
2028924
21164916
sasha@reactor:
Right, so the regular pipe works in POSIX mode.

QUESTION: what are the real differences between | and |& in this command? (You don't have to answer, I'll look it up!)
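
For what it's worth, a short sketch of the difference, assuming GNU awk is installed under the name gawk (|& is a gawk extension, which is why --posix rejects it). A plain | pipe goes one way only, while |& opens a two-way pipe: awk can write to the command and then read its output back. The helper name coproc_demo and the sample words are made up for illustration.

```shell
# Hypothetical helper; needs gawk, since |& is not part of POSIX awk.
coproc_demo() {
    gawk 'BEGIN {
        cmd = "sort"
        # Write two lines to the coprocess...
        print "pear"  |& cmd
        print "apple" |& cmd
        # ...close only the writing end so sort sees EOF...
        close(cmd, "to")
        # ...then read the sorted lines back from the same command.
        while ((cmd |& getline line) > 0) print line
        close(cmd)
    }'
}

# Only run the demo where gawk is actually available.
if command -v gawk >/dev/null 2>&1; then
    coproc_demo
fi
```

For simply reading the output of df, nothing is ever written to the command, so the ordinary | pipe is all that is needed.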

konsolebox: your post shows an example of how/when the close() statement is used; I had been wondering why there is a close() but no open() command. But I still wonder, does it *need* to be used, particularly in the context of this thread? What if it's not used? Does it make any difference anywhere, as far as memory use or anything? And I don't understand the bit about 'hashes', neither in this context nor, perhaps, in any other. I slept in on the day they taught us about hashes.

My next post will demonstrate what I'm getting at with all this. Thanks for everyone's input thus far.
 
10-11-2010, 05:40 AM   #9
ghostdog74
Senior Member
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5
Quote:
Originally Posted by GrapefruiTgirl View Post
ghostdog74 can you expand on that a little?
What I mean is, when you do the getline, you don't assign it to a variable:
Code:
while ( .. | getline )
So by default, the current line will be $0 (basically just like in Perl, by default the current line is $_ .)
From there, you can refer to your fields as normal: $1, $2, ...
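
The two forms side by side, as a minimal sketch with echo standing in for df (the helper name getline_forms and the sample words are made up for illustration):

```shell
# Hypothetical helper comparing "cmd | getline" with "cmd | getline var".
getline_forms() {
    awk 'BEGIN {
        cmd = "echo alpha beta"

        # Plain getline: the line lands in $0, so $1, $2 ... are set.
        cmd | getline
        print "plain: " $2
        close(cmd)

        # getline var: the line lands in var only; split it by hand.
        cmd | getline line
        n = split(line, parts)
        print "var: " parts[2] " (" n " fields)"
        close(cmd)
    }'
}

getline_forms
```

Both lines report beta; the second form just takes the extra split() step to get at it.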
 
10-11-2010, 05:53 AM   #10
GrapefruiTgirl
Guru
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594
Original Poster
Ahh, yes thanks, I understand. As opposed to this:
Code:
 awk --posix 'BEGIN{ while ("df"|getline line){ split(line,parts); print parts[2]} }'
Where I assign each current line to the variable line (no $ in awk). Now, parts[2] is the same as what $2 would normally be.

And.. My next post will not yet demonstrate where I'm going with this.
 
10-11-2010, 06:36 AM   #11
GrapefruiTgirl
Guru
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594
Original Poster
testing: "df | awk" in the shell, vs awk -f "(df | getline)"


So despite this code working on the command line, it doesn't work as a script.

I take that back - it *does* work as a script, but to run it in a loop, I need to close() the fd after each run. Otherwise, I suppose what happens is that the thing runs once, the fd remains open but there's no more data, and the program just sits there. Or it just runs once and exits. This works anyhow:
Code:
#!/usr/bin/awk -f
# This runs only once and exits, unless the close() statement is uncommented, and then it works great:
BEGIN{
   cmd="df"
   for (x=0; x<= 1000; x++) {
       while (cmd|getline) { print $2 }
#  close(cmd)
   }
}
So, the whole idea here, is to see if using awk+df in this way, would be faster than using bash to pipe `df` into `awk`.. Some highly scientific tests follow:
Code:
#!/bin/bash
# time_bash.sh
for ((x=0; x<=10000; x++)); do
    df | awk '{ print $2 }'
done

real    0m22.774s
user    0m4.696s
sys     0m19.455s
Code:
#!/bin/dash
# time_dash.sh
for ((var=0; var<=10000; var++)); do
    df | awk '{ print $2 }'
done

# Wtf.. Doesn't run? It may be early but this is perplexing me:
./time_dash: 3: Syntax error: Bad for loop variable

# What do you mean, "bad for loop variable"? Pfft..
Code:
#!/bin/dash
# this works..
var=0
while [ $var -le 10000 ]; do
    df | awk '{ print $2 }'
    var=$((var+1))
done
real    0m21.367s
user    0m5.276s
sys     0m12.338s
sasha@reactor:
Code:
#!/usr/bin/awk -f
# time_awk.sh
BEGIN{
   cmd="df"
   for (x=0; x<=10000; x++) {
       while (cmd|getline) { print $2 }
       close(cmd)
   }
}

real    0m29.776s
user    0m5.963s
sys     0m16.284s
Results: I'm surprised! Using awk alone appears to consume more time than using either shell. I wonder why this is...
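
One possible factor, offered as a guess rather than a measurement: awk does not run df directly. The command string in cmd | getline is handed to a shell (as with popen()), so every pass of the loop still starts new processes, much as the shell loops do. The sketch below only shows that a shell is involved, by putting shell syntax inside the command string; it says nothing about timing. The helper name shell_in_pipe is illustrative only.

```shell
# Hypothetical helper: the command string is parsed by a shell, not by awk.
shell_in_pipe() {
    awk 'BEGIN {
        # ";" is shell syntax, so both echos run only if a shell parses this.
        cmd = "echo one; echo two"
        while ((cmd | getline line) > 0) print line
        close(cmd)
    }'
}

shell_in_pipe
```

If awk executed the string itself, the semicolon would not separate two commands; seeing both lines means a shell was started for the pipe.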

Well, the reason for this thread was "why didn't the command line of awk work, with no stdin nor files", and that has been solved thanks to those contributors above. Thanks for this!

Even though I'll mark it solved, further input on anything in here will be welcome.
 
10-11-2010, 08:08 AM   #12
grail
Guru
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,577
Quote:
./time_dash: 3: Syntax error: Bad for loop variable
Does dash support this for loop construct? I only know it from bash:
Code:
for ((var=0; var<=10000; var++)); do
What happens if you leave the close(cmd) out?
It may also pay to look at flushing, both with and without the close, to see if that helps:
Code:
fflush("df")
 
10-11-2010, 08:22 AM   #13
GrapefruiTgirl
Guru
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594
Original Poster
Well, maybe I'm just having a really off sort of day.. Actually I don't feel great, maybe getting a cold, so if I'm missing dumb obvious things today, that's my excuse.

As for "does dash support that loop?", well as far as I knew until right now, it did. I've written most of my scripts to run in any shell, and I'd be 100% surprised if I don't have a single working loop in some of my scripts. But, I just created a simple dash script with a loop and an echo statement, and it failed to run the same way:
Code:
sasha@reactor: ./dash_loop
./dash_loop: 3: Syntax error: Bad for loop variable
sasha@reactor: cat dash_loop
#!/bin/dash

for ((x=0;x<=5;x++)); do
  echo hello
done

sasha@reactor:
With single brackets, double brackets, square brackets, and with no brackets at all, it won't run either.
I'm baffled. Here's the manpage for dash:
Quote:
The syntax of the while command is

while list
do list
done

The two lists are executed repeatedly while the exit status of the first list is zero. The until command is similar, but has the word until in
place of while, which causes it to repeat until the exit status of the first list is zero.

The syntax of the for command is

for variable [ in [ word ... ] ]
do list
done
So, no mention of the standard for-loop. Weird. I just can't believe it.

I'll check what the usage of fflush() does in my test programs above and update again in a bit.
 
10-11-2010, 08:31 AM   #14
GrapefruiTgirl
Guru
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594
Original Poster
Haven't checked fflush() effects yet, but here's the simple workaround for lack of a for loop, using `seq` instead:
Code:
#!/bin/dash

for x in $(seq 1 100); do
  echo hello
done
Though I still cannot believe the familiar construct earlier doesn't work.
 
10-11-2010, 08:48 AM   #15
konsolebox
Senior Member
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15
@GrapefruiTgirl, @grail I think by default awk closes all file descriptors that are still open when it exits. It's just a way of demonstrating how opening and closing of command pipes works in awk.

@GrapefruiTgirl I sometimes call associative arrays hashes, or vice versa. So with the internal hashes in awk, any command string is associated with a single value: a single fd, that is.

Here's an example awk script that necessarily requires close():

http://www.linuxquestions.org/questi...0/#post4066219

Last edited by konsolebox; 10-11-2010 at 08:52 AM.
 
  

