Q: are pipelines processed in parallel/serial?
Do pipelines start multiple threads so that data potentially flows from start to final output at once? (Given enough CPUs)
For instance:

Code:
cat data | awk 'some conditional operations' | sed 's/xyz/def/' | tr '@' ':' | awk 'some more' | sort

Does each of these stages get its own process? Is cat assured to finish before handing control over to awk? Many thanks for clearing this up :P ~James |
hi,
And welcome to LQ! Just a thought: I'd guess this depends on the receiving process, not on the pipe. E.g. it may make sense for awk as the receiver (right-hand side of a pipe) to do its thing to each line as it dashes past, but it certainly doesn't make sense to start sorting before you've seen all the data. Pure speculation, I never really thought about it in the past ;} Cheers, Tink |
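Tink's guess about sort is easy to confirm from the shell: by definition, sort cannot emit even its first output line until it has read its last input line, whereas a line filter can stream. A minimal sketch (plain POSIX shell, nothing assumed beyond printf and sort):

```shell
# sort must buffer everything: the first output line ("1") can only
# be chosen after the last input line ("1") has been read.
printf '3\n1\n2\n' | sort
# prints:
# 1
# 2
# 3
```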
Hi.
The pipes are set up in parallel, but as Tink suggests, it's up to the utilities whether they pass the data through 'live' or not. Try piping ping into awk:

$ ping localhost | awk '{print $(NF-1)}'

and it works in parallel. Then stick sed on the end:

$ ping localhost | awk '{print $(NF-1)}' | sed '1d'

and nothing appears for a long while. (Strictly speaking it isn't sed that's waiting: once awk's stdout is a pipe rather than a terminal, stdio switches it to block buffering, so sed simply doesn't receive anything until awk's buffer fills or awk exits.) Dave |
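The stalling behaviour above is about buffering, not about the pipeline being serial: every stage is its own process and they all start at once. A sketch of the same shape of pipeline with a finite input, plus a hedged note on stdbuf (a GNU coreutils tool; availability varies):

```shell
# Every stage is a separate process, started at the same time.
# With a finite input the whole chain streams through and exits:
seq 1 5 | awk '{print $1 * 2}' | sed '1d'
# prints:
# 4
# 6
# 8
# 10

# With an endless input (like ping), a middle stage writing to a pipe
# block-buffers its output, so the next stage appears to "wait".
# On systems with GNU coreutils, stdbuf can force line buffering:
#   ping localhost | stdbuf -oL awk '{print $(NF-1)}' | sed '1d'
```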
"man 7 pipe"
|
thanks all ;)
|
A "pipe" is an inter-process communication (IPC) channel. (It's one of several.)
Basically, a "pipe" presents itself as a file, which a particular process can either read from or write to. The "pipe," then, becomes a buffered communications mechanism between its "reader" and its "writer," always appearing to both of them as "just a file." But here's the magic...
When you, in the shell, type something like ls | grep foo, you actually cause two processes to be launched: one is ls, which writes its output to its STDOUT, and the other is grep, which reads its input from its STDIN. And... (magic time!) the STDOUT of the one is the STDIN of the other! It's a pipe. (Please step outside the room while your brain explodes. We've all been there... we don't mind. Now, when you come back into the room, you ought to be saying either "Sweet!" or else, "That is so way k-e-w-e-l!") Yeah, those dudes at Bell Labs way back in the 1970's {I was almost-there, but never mind!} had some pretty mind-blowing ideas... |
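On Linux you can actually see the "just a file" claim from inside a pipeline, because /proc exposes what each file descriptor points at. A sketch (Linux-specific, since it relies on /proc; the inode number in the output will differ on your system):

```shell
# readlink reads the symlink for its own fd 0 (STDIN).  Run on the
# right-hand side of a pipe, that fd is a pipe object, not a tty:
true | readlink /proc/self/fd/0
# prints something like:
# pipe:[123456]
```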
There are two major products that come out of Berkeley: LSD and UNIX. We don't believe this to be a coincidence.
- Jeremy S. Anderson |
There are times when I think everything would make a lot more sense if I were on LSD. Then I remember I prefer the purple pills the doctor gives me. *munch* *munch*
|
In addition to the great explanation by sundialsvcs, there is another thing about unix pipes which is useful but requires care in some circumstances (actually there are a few other such things, but I will only talk about one that relates somewhat to your question): when a process writes to a pipe after every file descriptor on the "read" end has been closed, the process is sent a SIGPIPE signal. The default action for that signal (i.e., the action taken unless the program explicitly handles or ignores it) is to terminate the program.
As you might imagine, this behavior presents great potential for use. Here is a prototypical example of the kind of situation for which it was intended: suppose you have a really big gzip file, of which you want to read the first few lines to see what is inside. You could do something like this: Code:
zcat reallybigfile.gz | head

As you might also imagine, you can also abuse this functionality, and it might even get you into trouble if you aren’t careful. |
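The zcat | head trick works precisely because of SIGPIPE: once head exits, zcat's next write into the closed pipe kills it, so the whole file is never decompressed. The same effect in a form that is easy to try, with a note on inspecting the writer's exit status (PIPESTATUS is bash-specific):

```shell
# seq would happily print a million lines, but head closes the pipe
# after three, so seq is killed by SIGPIPE almost immediately:
seq 1 1000000 | head -n 3
# prints:
# 1
# 2
# 3

# In bash, the writer's exit status reflects the signal:
# 128 + 13 (SIGPIPE) = 141.
#   seq 1 1000000 | head -n 3 > /dev/null; echo "${PIPESTATUS[0]}"
```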
This is probably one of the more interesting linux topics I've covered so far. Thanks a lot guys :)
|