BASH comments
Sometimes code readability is improved by breaking a pipe into individual commands, one per line.
Contrived example:
Code:
cat $InFile \
Code:
cat $InFile \ # Read input file
Is there a way to do this?
Daniel B. Martin |
No, I don't believe that's possible. The backslash is there to escape the non-printing newline character that normally terminates the line, so nothing else can follow it on that line.
To the shell, the multiple lines appear to be a single line, so the comments can only come after the last one. |
If you put the pipe at the end of the line, Bash will continue with the next line, so there is no need for the backslash.
Code:
$ cat a
I failed to find any support for this in the GNU Bash Reference Manual though, so caveat emptor. |
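A minimal sketch of the trailing-pipe style described in the post above. The function name first_sorted and the sample words are made up for illustration; this shows behaviour observed in Bash, not wording taken from the manual.

```shell
# Hypothetical helper: print the alphabetically first argument.
first_sorted() {
    printf '%s\n' "$@" |   # Emit one argument per line
        sort |             # Sort the lines alphabetically
        head -n 1          # Keep only the first line
}

first_sorted banana apple cherry
```

Because each line ends with the pipe itself rather than a backslash, the shell is still waiting for the next command, and a comment may sit between the pipe and the newline.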
Actually, it seems to be obliquely defined: all these operators require both sides to exist -- an empty command on either side makes absolutely no sense for these, and a comment or newline or empty line(s) do not produce any statement. The situation is complementary to 'do' or 'then' statements -- they need a preceding semicolon or a new line in Bash, Bourne shells and derivatives and POSIX shells -- so this behaviour is quite intuitive and useful. I just wish it was explicitly documented somewhere. |
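The same reasoning can be tried with the && and || operators, which also need a command on both sides. This is only a sketch; the function name describe is invented for the example.

```shell
# Hypothetical helper: report whether the first argument is empty.
describe() {
    [ -n "$1" ] &&           # Left side: succeeds when the argument is non-empty
        echo "non-empty" ||  # Runs only when the test succeeded
        echo "empty"         # Runs when the test (or the echo above) failed
}

describe hello
describe ""
```

As with the pipe, the operator at the end of the line leaves the command incomplete, so the comment and the newline after it are skipped.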
Quote:
Code:
cat $InFile \ # Read input file |
Quote:
Code:
test$ ls -gGh |
I don't see the point of so many comments. Just make a comment on what the pipe accomplishes. For example:
Code:
# cut off last 6 fields |
When writing a complicated piece of code (regardless of language) I like to express in words the logic I want to implement. Then, one piece at a time, I fill in the code. When done, my code is fully commented because I started with all comments. Daniel B. Martin |
Well, I do write a lot of scripts, and usually if a command becomes too complicated I split it up rather than doing it all at once.
This helps not only because you can comment each part, but because it is easier to understand. Oftentimes there are different and more readable ways to do things that don't involve huge commands or long pipes. Either way Telengard posted what you want. |
I myself like to use subshells when working with long pipes, i.e.
Code:
(
Code:
( echo 'plot "datafile" u 1:2 t "data" w points, \'
On embedded machines with a limited memory subsystem subshells may not be the best option, but for standard Intel/PowerPC et al. architectures, the subshell is forked from the parent using copy-on-write, using physically the same RAM for code and initial data structures, and therefore uses very little actual system resources. (Just a per-process kernel structure for each subshell, I believe.) This means that on a typical workstation or a server, there is no practical difference in resource use between plain pipe commands and piped subshells. |
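Since the code in the post above is cut off, here is a separate, minimal sketch of the piped-subshell style with one comment per stage. The function name count_unique and its stages are assumptions made for illustration, not the original poster's script.

```shell
# Hypothetical helper: count the distinct arguments it is given.
count_unique() {
    (
        # Stage 1: emit the arguments, one per line
        printf '%s\n' "$@"
    ) | (
        # Stage 2: sort and drop duplicate lines
        sort -u
    ) | (
        # Stage 3: count the remaining lines, stripping any padding from wc
        wc -l | tr -d ' '
    )
}

count_unique a b a c
```

Each parenthesised stage is an ordinary command list, so comments can go on any line inside it without backslashes.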
I've just never really bothered to find out about the side effects of using command lists in a pipe (specifically, whether shell state propagates, or whether it is just inherited from the parent shell as with subshells) -- and to be honest, I tend to always forget the required semicolon at the end of the command list. I've gravitated to using subshells, because I've felt them to be more intuitive. For those who are unaware of the semicolon detail with command lists, the equivalent command list variant of the piped subshells
Code:
( echo foo ) | ( cat ; echo bar )
is
Code:
{ echo foo ; } | { cat ; echo bar ; }
and not
Code:
{ echo foo } | { cat ; echo bar }
|
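A short sketch of the semicolon detail, capturing the output so it can be inspected. The variable names out and out2 are only for this example.

```shell
# "{" and "}" are reserved words, so each list needs a terminator before "}".
out=$( { echo foo ; } | { cat ; echo bar ; } )
printf '%s\n' "$out"

# A newline terminates the list just as well as a semicolon does.
out2=$(
    {
        echo foo
    } | {
        cat
        echo bar
    }
)
printf '%s\n' "$out2"
```

Both forms print foo followed by bar. Writing the closing brace on its own line is one way to stop forgetting the final semicolon.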
I have no problem with command grouping. Command groups are basically just anonymous functions. And it only takes getting caught by the final semicolon thing a few times before you learn to watch out for it.
|
With subshells, each subshell inherits its state from the parent; changes never propagate back. With command lists, the state is shared with the parent but only if the command list is not part of a pipe.
Code:
x=5 ; { x=6 ;} ; echo $x
outputs 6, but
Code:
x=5 ; { x=6 ;} | { x=7 ;} ; echo $x
outputs 5. I'm not sure if this is properly documented anywhere. I believe a future version of Bash might well output 7 in the latter case: running the last command list of a pipe in the original shell state might be a worthwhile optimization. (Just to be clear: x=5;(x=6);echo $x will always output 5, as will x=5;(x=6)|(x=7);echo $x .) |
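The three cases can be put side by side in one sketch. It assumes Bash with its default options (in particular, the lastpipe shell option is not enabled); the variable names grouped, subshelled and piped are invented here.

```shell
# Command group outside a pipe: runs in the current shell, so the change sticks.
x=5 ; { x=6 ; } ; grouped=$x

# Explicit subshell: the assignment is lost when the subshell exits.
x=5 ; ( x=6 ) ; subshelled=$x

# Command groups inside a pipe: each side runs in its own subshell
# (assumption: Bash defaults, lastpipe off).
x=5 ; { x=6 ; } | { x=7 ; } ; piped=$x

echo "grouped=$grouped subshelled=$subshelled piped=$piped"
```

Under those assumptions only the first case changes x in the calling shell. Shells that run the last pipeline stage in the current shell would report 7 for the third case.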
Pipelines - Bash Reference Manual
Code:
$ unset a; a="parent_shell"
As each command group is enclosed in {;} (curly braces), one might expect the entire command line to share the same value of a. Instead, three separate shell environments each contain their own unique a. It is the pipeline which creates the subshells and propagates a into them. That's how I think it works, but as I said it seems tricky to me. |
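The original transcript is cut off above, so here is a separate minimal sketch of the same idea. It assumes Bash with default options; the messages are made up for illustration.

```shell
a="parent_shell"

# Each side of the pipe starts with a copy of the parent's value of a,
# and neither side's assignment reaches the parent or the other side.
{ echo "left sees $a" ; a="left" ; } |
    { cat ; echo "right sees $a" ; a="right" ; }

echo "parent still has $a"
```

Both sides report parent_shell because each subshell inherits the parent's state when the pipeline starts, and the final echo shows that the parent's copy is untouched.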
Hmm, I was always under the impression that the part before the first pipe ran in the current environment. But it looks like I was mistaken.
In any case, this should also mean that any time you use a (..) subshell in a pipeline you end up spawning two sub-shells for it, correct? |
I decided to do some tests, and the results are a bit startling.
Code:
strace -qf bash -c ' date | cat | cat ' 2>&1 | grep -ce 'clone('
(I believe this is related to the way Bash creates the implicit subshells. Normally, if there is only one command to run in a subshell, Bash exec's it, avoiding the unnecessary fork()/clone().)
Timing tests,
Code:
time bash -c 'for ((i=0; i<1000; i++)); do date | cat | cat ; done' 2>&1 >/dev/null
Using more complex pipelines there is no difference between subshells and command lists:
Code:
strace -qf bash -c '( date ; date ) | ( date ; cat ) | ( date ; cat )' 2>&1 | grep -ce 'clone('
These tests show that at least on my workstation, using explicit subshells in Bash pipelines is definitely a good idea. They do not use any extra resources compared to the alternatives, no extra syntax requirements compared to normal shell syntax, and the semantics are clear.
You know, up to now I have avoided using command lists in Bash. Where one might use a command list, I've used a Bash function (subshell in a pipeline) instead. Without your posts in this thread, Telengard, I would still be relying on a hazy personal preference, instead of actual knowledge. I for one have learned something new, something that I probably would not have found out on my own; thank you! |
Code:
while true; do echo $((val++)); sleep 1; done
All of these things make a lot of sense if you look at how a shell is written in C, but the syntax of bash makes it appear as though this behavior is idiosyncratic. In my opinion, things like this irritate people because you don't need to understand the internal limitations of bash in order to use it. Unless bash starts using the "system" idiom to call external programs or it starts routing all IPC itself, it will never get away from extensive use of subshells.
Kevin Barry |
We're getting terribly off-topic here, but the C system() function is a major source of security problems (related to quoting and escaping), and adding yet another "framework" for IPC would severely restrict the usability of Bash. I'm severely tempted to rant about applying modularity instead of framework paradigm, but that would be completely off-topic, and serve no purpose here really.
I thought my tests above showed that the cost of subshells in pipelines is negligible: zero for all single-command pipe segments, and only one process per pipe segment for multi-command ones. In particular,
Code:
date | # First command in the pipe,
Code:
( # First command in the pipe
The comment style for the first code example does work in Bash (and many other shells like tcsh, too), but I have not found it explicitly documented as working anywhere. I believe it is implicit, perhaps a side effect of the way commands are parsed, rather than anything intentional.
The second code snippet, the one using subshells, is explicitly documented. (In particular, the semantics are exactly the same at least in Bash, POSIX shells, and tcsh: the state is inherited from the parent process, and changes do not propagate outside the subshell.) There are no extra syntax quirks, unlike command lists in Bash (which require the final semicolon and are whitespace sensitive).
Let me put this in other words: I claim that using explicit subshells in Bash pipelines, i.e. (command(s)...)|(command(s)...)|...|(command(s)...) when comments or long commands are used, makes the code easier to write and to understand, and has no extra computing cost (run time or processes). Therefore, for complex Bash pipelines, I recommend the style used in my second code example in this post. |
This thread expanded into a more thorough exploration of the subject than anticipated.
Some languages (APL and REXX, for example) make it easy to comment in the desired fashion. Now I know it's not so easy in BASH. Okay, I can live with that. Thanks, and let's mark this one SOLVED! Daniel B. Martin |
Code:
$ echo $BASH_VERSION
Code:
$ awk 'BEGIN {print "one", #comment
|
Kevin Barry |
Daniel B. Martin |