LinuxQuestions.org
Old 10-21-2009, 02:53 AM   #1
poorman_installer
LQ Newbie
 
Registered: Jul 2008
Distribution: slax
Posts: 11

Rep: Reputation: 0
Stripping lines versus stripping bytes in a bash subshell.


Can anybody explain this one?

Code:
bash-3.1# echo "1
2
3
4
5" | (head --lines=2 > /dev/null; cat)
bash-3.1#
So the stripping above doesn't work (no output), while the following does (the first "1" byte and its newline byte are indeed stripped):

Code:
bash-3.1# echo "1
2
3
4
5" | (head --bytes=2 > /dev/null; cat)
2
3
4
5
bash-3.1#

I'm puzzled. Thanks in advance.

Last edited by poorman_installer; 10-21-2009 at 02:55 AM. Reason: prose styling
 
Old 10-21-2009, 04:07 AM   #2
Agrouf
Senior Member
 
Registered: Sep 2005
Location: France
Distribution: LFS
Posts: 1,591

Rep: Reputation: 79
Hello,
you are looking for the tail command.
Anyway, it looks like your implementation of head reads only the bytes it needs from stdin with the --bytes option, while it reads the whole input with the --lines option.
 
Old 10-21-2009, 05:17 AM   #3
poorman_installer
LQ Newbie
 
Registered: Jul 2008
Distribution: slax
Posts: 11

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Agrouf
Hello,
you are looking for the tail command.
Not really, see:
http://www.tomas-m.com/blog/994-Resume-your-build.html

Quote:
Anyway, it looks like your implementation of head reads only the bytes it needs from stdin with the --bytes option, while it reads the whole input with the --lines option.
I'm using slackware (slax) with GNU coreutils 6.12.
Your guess does not seem to agree with experiment:

Code:
bash-3.1# ( for I in $(seq 1 8) ;do echo $I;done )| (head -2 >/dev/null; cat )
3
4
5
6
7
8
bash-3.1# ( for I in $(seq 1 8) ;do echo $I;done ) > a; cat a | (head -2 >/dev/null ;cat)
bash-3.1# ( for I in $(seq 1 1859) ;do echo $I;done ) > a; cat a | (head -2 >/dev/null ;cat)
bash-3.1# ( for I in $(seq 1 1860) ;do echo $I;done ) > a; cat a | (head -2 >/dev/null ;cat)

bash-3.1# ( for I in $(seq 1 1861) ;do echo $I;done ) > a; cat a | (head -2 >/dev/null ;cat)

1861
bash-3.1#
So there is a difference between chopping the head on the fly and passing the data through a real file in between.

The weird thing is that when "head -2" above is replaced with "sed -e '2 q'" (which should be effectively the same), the magic number 1860 drops, while the first example works identically.
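A side note on the magic number: counting the bytes written by the loop hints at where 1860 comes from. The sketch below counts the bytes produced by printing 1..N, one number per line (seq_bytes is a made-up helper name, not something used in this thread). 1859 lines come to 8188 bytes and 1860 lines to 8193 bytes, so the threshold sits right at 8192 bytes. That would be consistent with head taking a single 8192-byte block from the pipe and leaving only the remainder for cat. The buffer size is an inference from these numbers, not something confirmed in the thread.

```shell
# seq_bytes N: number of bytes produced by printing 1..N, one number per line.
# (Hypothetical helper for illustration; equivalent to `seq 1 N | wc -c`.)
seq_bytes() {
    count=$(
        i=1
        while [ "$i" -le "$1" ]; do
            printf '%s\n' "$i"
            i=$((i + 1))
        done | wc -c
    )
    # Arithmetic expansion strips any padding some wc implementations add.
    echo $((count))
}

seq_bytes 1859   # 8188: fits within one 8192-byte read
seq_bytes 1860   # 8193: exactly one byte (the last newline) is left over
```

A leftover of one newline would match the single empty line printed for 1860, and a leftover of a newline plus "1861" would match the output for 1861.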
 
Old 10-21-2009, 05:40 AM   #4
Agrouf
Senior Member
 
Registered: Sep 2005
Location: France
Distribution: LFS
Posts: 1,591

Rep: Reputation: 79
Well, I suppose the head command is buffering the input somehow. With the --bytes option, it knows exactly how many bytes it has to read, so it reads exactly that many. Without the --bytes option, it does not know how many bytes to read, so it reads a big chunk of data and then analyzes it. In the first case, input is coming slowly, line by line, so head has time to parse it and stop reading. When you use the cat command, input is coming fast, so head reads a big chunk of data before it parses it.

Anyway, is using the read command an option?
Code:
( for I in $(seq 1 8) ;do echo $I;done )|(read;read;cat)
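To make the read idea work for an arbitrary number of lines, something like the following might do. It is only a sketch: skip_lines is a made-up name, and it assumes a shell whose read builtin takes no more than one line from a pipe, which is how bash behaves.

```shell
# skip_lines N: discard the first N lines of stdin, pass the rest through.
# Hypothetical helper; relies on the shell's `read` consuming no more than
# one line from a pipe, which is how bash behaves.
skip_lines() {
    n=$1
    while [ "$n" -gt 0 ]; do
        # IFS= and -r keep read from mangling the line; stop early at EOF.
        IFS= read -r discarded || break
        n=$((n - 1))
    done
    cat
}

printf '%s\n' 1 2 3 4 5 6 7 8 | skip_lines 2
```

The last line should print 3 through 8. Because read has to stop exactly at each newline, it reads the pipe in very small steps, so this is likely to be slow when many lines are skipped.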

Last edited by Agrouf; 10-21-2009 at 05:41 AM.
 
Old 10-21-2009, 05:42 AM   #5
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Servers: Debian Squeeze and Wheezy. Desktop: Slackware64 14.0. Netbook: Slackware 13.37
Posts: 8,546
Blog Entries: 28

Rep: Reputation: 1176
I'm very puzzled; the behaviour is not consistent. I ran the following command repeatedly (by using up arrow at the command prompt to recall it)
Code:
c:~$ ( for I in $(seq 1 8) ;do echo $I;done )| (head --lines=2 >/dev/null; cat )
Sometimes it produced no output and sometimes (roughly half the time each)
Code:
3
4
5
6
7
8
Here are relevant software versions
Code:
c:~$ cat /etc/slackware-version
Slackware 13.0.0.0.0
c:~$ head --version
head (GNU coreutils) 7.4
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by David MacKenzie and Jim Meyering.
c:~$ cat --version
cat (GNU coreutils) 7.4
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Torbjorn Granlund and Richard M. Stallman.
c:~$ bash --version
GNU bash, version 3.1.17(2)-release (i486-slackware-linux-gnu)
Copyright (C) 2005 Free Software Foundation, Inc.
 
Old 10-21-2009, 06:24 AM   #6
poorman_installer
LQ Newbie
 
Registered: Jul 2008
Distribution: slax
Posts: 11

Original Poster
Rep: Reputation: 0
Thanks Agrouf and Catkin for your feedback.
I had actually already resolved it by using the line command, similarly to what Agrouf suggested, although in a less satisfactory manner than I would have liked if sed and head had behaved as expected.
I posted the issue because it seems relevant to on-the-fly implementations like the one discussed in the link I gave previously.
Also, I suppose the POSIX specifications should address the issue and dictate some rules, but I could not find anything on a first skim through them.

I also did a catkin-like trial by issuing the following command line and waiting a few seconds (I omit the first tens of lines):
Code:
bash-3.1# while true;do ( for I in $(seq 1 8) ;do echo $I;done )| (head --lines=2 >/dev/null; cat )|wc;done
.
.
.
      6       6      12
      6       6      12
      6       6      12
      6       6      12
      6       6      12
      6       6      12
      6       6      12
      6       6      12
      6       6      12
      0       0       0
      6       6      12
      6       6      12
      6       6      12
      6       6      12
      6       6      12
      6       6      12
      6       6      12
The failure rate is much lower than the 50% reported by catkin, as I had also noticed when testing manually.
But now an even weirder punchline; try:

Code:
bash-3.1# ( while true;do ( for I in $(seq 1 8) ;do echo $I;done )| (head --lines=2 >/dev/null; cat )|wc;done ) |less
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
      0       0       0
lines 1-30
Not only is the output unexpected, but when I quit the less pager by pressing 'q', I'm not returned directly to the bash prompt. I have to press Ctrl-C as well, which I have never experienced before.
 
Old 10-21-2009, 07:05 AM   #7
Agrouf
Senior Member
 
Registered: Sep 2005
Location: France
Distribution: LFS
Posts: 1,591

Rep: Reputation: 79
I believe the head command is not supposed to be used like that anyway. It can read the whole file, or as much as it wants, depending on the implementation. I've just tested on AIX, and there the head command reads everything either way, even with the -c option (the same as --bytes in GNU).
You should not expect head to read any specific amount of data.
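If the goal is only to drop the first lines and pass the rest on, one way to sidestep the question of how much head reads is to let a single process do both the skipping and the copying. A minimal sketch with standard tools (the function names are made up for illustration):

```shell
# Each function drops the first two lines of stdin and prints the rest.
# A single process reads the whole stream, so no data can be lost in a
# hand-over between two readers. Function names are illustrative only.
drop2_tail() { tail -n +3; }
drop2_sed()  { sed '1,2d'; }
drop2_awk()  { awk 'NR > 2'; }

printf '%s\n' 1 2 3 4 5 6 7 8 | drop2_tail
```

This does not cover the case where a second, different program has to pick up the stream afterwards.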
 
Old 10-21-2009, 07:49 AM   #8
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,695
Blog Entries: 5

Rep: Reputation: 241
seriously...all you ever need is awk for what you are doing in post #1

Last edited by ghostdog74; 10-21-2009 at 07:51 AM.
 
Old 10-21-2009, 08:20 AM   #9
poorman_installer
LQ Newbie
 
Registered: Jul 2008
Distribution: slax
Posts: 11

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by ghostdog74
seriously...all you ever need is awk for what you are doing in post #1
Such a flat and useless statement.

1) I'm not trying to do anything in my first post, just exposing a phenomenon.

2) Although this all actually started with a script I was setting up, you don't know what I originally needed to do, and no, awk wouldn't have solved it aptly.

3) I already said the original problem was solved, so this was clearly a thread for its own sake.

4) You didn't add anything to the core of the discussion, which is not about how to strip the two top lines (that's a no-brainer) but rather about what one expects from piping filters.

So, back from the diverting noise to the real matter: thanks to Agrouf for testing on another Unix. You're probably right about what to expect from head; still, I think it's a pity to break the piping metaphor (one could say this issue violates conservation of matter, water to be precise, to stay with the model), which is so simple and powerful.
It lets you do such things with a few keystrokes, and I hold it to be one of the main gems left from the original Unix concepts.

And anyway, even if that can be called merely unexpected behaviour, I strongly suspect that the subsequent malfunctions (see catkin's last post and mine) are due to some bug.
 
Old 10-21-2009, 08:36 AM   #10
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Servers: Debian Squeeze and Wheezy. Desktop: Slackware64 14.0. Netbook: Slackware 13.37
Posts: 8,546
Blog Entries: 28

Rep: Reputation: 1176
I guess it's not so much defective behaviour as evidence of an asynchronous phenomenon; if any of head, cat or wc finds its stdin empty it will exit, and the pipeline (and sub-shells) are torn down even if the data-generating component has not finished. I have to go now so I cannot try it myself; what happens if a sleep 1 is introduced before the data-reading components start? I think that would produce consistent behaviour.
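The sleep experiment might be written like this. It is only a sketch: run_delayed is a made-up name, and the one-second delay is an arbitrary value assumed to be long enough for the writer to finish. With a head that reads in blocks (GNU coreutils appears to, judging by the results earlier in the thread), the expectation is that head swallows everything and cat prints nothing, every time.

```shell
# run_delayed: let the writer finish before any reader starts, so the whole
# input is already sitting in the pipe when head issues its first read.
# Hypothetical helper; the 1-second delay is an arbitrary choice.
run_delayed() {
    printf '%s\n' 1 2 3 4 5 6 7 8 |
        (sleep 1; head -n 2 >/dev/null; cat)
}

first=$(run_delayed)
second=$(run_delayed)
# With the race removed, repeated runs should agree with each other.
[ "$first" = "$second" ] && echo "consistent" || echo "inconsistent"
```

If the runs agree, that supports the idea that the earlier variation came from timing between the writer and the readers and not from a defect in head or cat.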
 