Stripping lines versus stripping bytes in a bash subshell.
Hello,
You are looking for the tail command.
Anyway, it looks like your implementation of head reads only the bytes it needs from stdin with the --bytes option, while it reads the whole file with the --lines option.
I'm using Slackware (Slax) with GNU coreutils 6.12.
Your guess does not seem to agree with my experiments:
Code:
bash-3.1# ( for I in $(seq 1 8) ;do echo $I;done )| (head -2 >/dev/null; cat )
3
4
5
6
7
8
bash-3.1# ( for I in $(seq 1 8) ;do echo $I;done ) > a; cat a | (head -2 >/dev/null ;cat)
bash-3.1# ( for I in $(seq 1 1859) ;do echo $I;done ) > a; cat a | (head -2 >/dev/null ;cat)
bash-3.1# ( for I in $(seq 1 1860) ;do echo $I;done ) > a; cat a | (head -2 >/dev/null ;cat)
bash-3.1# ( for I in $(seq 1 1861) ;do echo $I;done ) > a; cat a | (head -2 >/dev/null ;cat)
1861
bash-3.1#
So there is a difference between chopping the head off on the fly and passing the data through a real file in between.
The weird thing is that when "head -2" above is replaced with "sed -e '2 q'" (which should be effectively the same), the magic number 1860 becomes lower, while the first example works identically.
Well, I suppose the head command is buffering its input somehow. With the --bytes option, it knows exactly how many bytes it has to read, so it reads exactly that many. Without --bytes, it does not know how many bytes to read, so it reads a big chunk of data and then analyzes it. In the first case the input arrives slowly, line by line, so head has time to parse it and stop reading. When you use the cat command the input arrives fast, so head reads a big chunk of data before it parses it.
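One way to check the read-ahead guess above is to count what is left in the pipe for the next reader after head has taken its two lines. This is only a sketch: the function name leftover_after_head and the figure 100000 are chosen here for illustration and do not come from the thread, and the numbers printed depend on the head implementation and on timing, so none are shown.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): feed the numbers 1..$1 through a
# pipe, let head take two lines, then count the bytes left for the next reader.
leftover_after_head() {
    seq 1 "$1" | { head -n 2 >/dev/null; wc -c; } | tr -d ' '
}

total=$(seq 1 100000 | wc -c | tr -d ' ')
left=$(leftover_after_head 100000)

# The first two lines ("1" and "2") are only 4 bytes; anything consumed
# beyond that was read ahead by head and is lost to later commands.
echo "total=$total left=$left consumed=$((total - left))"
```

If "consumed" comes out much larger than 4, head has read ahead in chunks, which would match the explanation given in this post.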
Anyway, is using the read command an option?
Code:
( for I in $(seq 1 8) ;do echo $I;done )|(read;read;cat)
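The same idea can be generalized to any number of lines. The sketch below assumes that the shell's read builtin stops at the newline and does not read ahead on a pipe, which is how bash behaves; the function name skip_lines and the variable name discarded are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for this example): discard the first $1
# lines of stdin using the shell's own read, then hand the rest to cat.
skip_lines() {
    n=$1
    while [ "$n" -gt 0 ]; do
        # IFS= and -r keep read from trimming blanks or interpreting backslashes.
        IFS= read -r discarded || break
        n=$((n - 1))
    done
    cat
}

seq 1 8 | skip_lines 2
```

Because read consumes only the lines it is asked for, the result should not depend on how fast the input arrives, unlike the head and sed variants discussed above.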
I'm very puzzled; the behaviour is not consistent. I ran the following command repeatedly (using the up arrow at the command prompt to recall it):
Code:
c:~$ ( for I in $(seq 1 8) ;do echo $I;done )| (head --lines=2 >/dev/null; cat )
Sometimes it produced no output, and sometimes (roughly half of the time each) it produced:
Code:
3
4
5
6
7
8
Here are the relevant software versions:
Code:
c:~$ cat /etc/slackware-version
Slackware 13.0.0.0.0
c:~$ head --version
head (GNU coreutils) 7.4
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by David MacKenzie and Jim Meyering.
c:~$ cat --version
cat (GNU coreutils) 7.4
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Torbjorn Granlund and Richard M. Stallman.
c:~$ bash --version
GNU bash, version 3.1.17(2)-release (i486-slackware-linux-gnu)
Copyright (C) 2005 Free Software Foundation, Inc.
Thanks Agrouf and Catkin for your feedback.
I had actually already solved it by using the line command, similar to what Agrouf suggested, although less satisfactorily than I would have liked had sed and head behaved as expected.
I posted the issue because it seems relevant to on-the-fly implementations like the one treated in the link I gave previously.
Also, I suppose the POSIX specifications should address the issue and dictate some rules, but I could not find anything on a first skim through them.
I also did a Catkin-like trial, by issuing the following command line and waiting a few seconds (I omit the first tens of lines):
Not only is the output unexpected, but also, when I quit the less pager by pressing 'q', I am not returned directly to the bash prompt. I have to add a Ctrl-C to get there, which I have never experienced before.
I believe the head command is not supposed to be used like that anyway. It can read the whole file, or whatever it wants, depending on the implementation. I've just tested on AIX, and there the head command reads everything either way, even with the -c argument (the same as --bytes for GNU).
You should not expect head to read any specific amount of data.
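The picture may differ when stdin is a regular file rather than a pipe, because a file can be repositioned and a pipe cannot. As far as I know, GNU head tries to seek back to just after the last line it printed when its input is seekable; other implementations may not. The sketch below only compares the two cases and prints whatever it finds. The temporary file and the variable names are made up for this example.

```shell
#!/bin/sh
# Hypothetical demo: compare a pipe with a regular file on stdin.
tmp=$(mktemp)
seq 1 5000 > "$tmp"

# stdin is a pipe: head cannot give back what it read ahead.
via_pipe=$(cat "$tmp" | { head -n 2 >/dev/null; cat; } | wc -l | tr -d ' ')

# stdin is the file itself: head and cat share one file offset, and an
# implementation may seek back to just after the second line before exiting.
via_file=$({ head -n 2 >/dev/null; cat; } < "$tmp" | wc -l | tr -d ' ')

echo "lines left via pipe: $via_pipe, via file: $via_file"
rm -f "$tmp"
```

A result of 4998 lines in the file case would mean head left the offset exactly after line 2; a smaller number means some lines were read ahead and lost.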
seriously...all you ever need is awk for what you are doing in post #1
Such a flat and useless statement.
1) I'm not trying to do anything in my first post, just describing a phenomenon.
2) Although it all actually started with a script I was setting up, you don't know what I needed to do originally and no, awk wouldn't have solved it aptly.
3) I already said the original problem was solved, so this was clearly a thread for its own sake.
4) You didn't add a bit to the core of the discussion, which is not about how to strip the top two lines (that's a no-brainer) but about what one expects from piping filters.
So, back from the distracting noise to the real matter: thanks to Agrouf for testing on another Unix. You're probably right about what to expect from head; still, I think it's a pity to break the piping metaphor (one could say this issue violates conservation of matter, precisely water, to stay with the model), which is so simple and powerful.
It lets you do such things with few keystrokes; I hold it as one of the main gems left from the original Unix concepts.
And anyway, even if that can be called merely unexpected behaviour, I strongly suspect that the subsequent malfunctions (see catkin's last post and mine) are due to some bug.
I guess it's not so much defective behaviour as evidence of an asynchronous phenomenon; if either head or cat (or wc) finds its stdin empty it will exit, and the pipeline (and sub-shells) is demolished even if the data-generating component has not finished. I have to go now so cannot try it myself; what happens if a sleep 1 is introduced before the data-reading components start? I think that would produce consistent behaviour.
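The sleep experiment proposed above could be tried along these lines. This is only a sketch and has not been checked against the explanation given in this post; the function name delayed_readers is made up for the example. The delay ensures the writer has finished before head starts, so the timing race should disappear.

```shell
#!/bin/sh
# Hypothetical helper: delay the readers by $1 seconds so that the writer
# has finished and all eight lines are already in the pipe when head starts.
delayed_readers() {
    seq 1 8 | { sleep "$1"; head -n 2 >/dev/null; cat; }
}

# Run it twice and compare the line counts: they should be the same,
# whatever that count turns out to be on a given system.
for run in 1 2; do
    delayed_readers 1 | wc -l | tr -d ' '
done
```

If the two counts match here while the undelayed command keeps alternating, that would point to timing rather than to a bug in head or cat.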