LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   pipe buffering/blocked write caching question (https://www.linuxquestions.org/questions/programming-9/pipe-buffering-blocked-write-caching-question-799727/)

genmaicha 04-03-2010 01:23 AM

pipe buffering/blocked write caching question
 
I understand that the linux pipe is a buffer and that any data written to it will stay there until it is read, and if the max capacity of the buffer is reached, any additional writes will block (by default).

HOWEVER, the behavior of the pipeline below suggests that the write operations are buffered/cached before ever being written to the pipe on the writing side.

here is write.sh, which creates a 1000-byte string and writes it 100 times to stdout... the idea being that it'll block as soon as the 64 KiB Linux pipe capacity is reached:
Code:

#!/bin/sh

# build a 1000-byte string named chunk
chunk=""
for i in `seq 1 100`
do
  chunk="${chunk}0123456789"
done

# write it to stdout 100 times
for i in `seq 1 100`
do
  echo "writing chunk $i" >&2
  echo "$chunk"
done

here is read.sh, which reads the 1000-byte chunks one at a time, with a sleep between reads in order to 'force' the pipe to fill:
Code:

#!/bin/sh

a=1
while read line
do
  echo "`date`: read chunk $a"
  a=$((a+1))
  sleep .5
done

here is (part of) the output of "./write.sh | ./read.sh":
Code:

writing chunk 1
writing chunk 2
writing chunk 3

... <omitted> ...

writing chunk 55
writing chunk 56
writing chunk 57
writing chunk 58
Fri Apr 2 23:01:49 PDT 2010: read chunk 1
writing chunk 59
writing chunk 60
writing chunk 61
writing chunk 62
writing chunk 63
writing chunk 64
writing chunk 65
Fri Apr 2 23:01:50 PDT 2010: read chunk 2
Fri Apr 2 23:01:51 PDT 2010: read chunk 3
writing chunk 66
writing chunk 67
writing chunk 68
writing chunk 69
Fri Apr 2 23:01:51 PDT 2010: read chunk 4
Fri Apr 2 23:01:52 PDT 2010: read chunk 5
Fri Apr 2 23:01:52 PDT 2010: read chunk 6
Fri Apr 2 23:01:53 PDT 2010: read chunk 7
writing chunk 70
writing chunk 71
writing chunk 72

... <omitted> ...

This is not what I was expecting: I expected that once the capacity was reached, each read would be followed immediately by a write taking advantage of the freed space. Instead, the blocked write operation seems to wait for some random amount of time/space to free up before it unblocks and writes.

What exactly is going on, what's it called, and how can I control this behavior?

Sergei Steshenko 04-04-2010 12:45 PM

Quote:

Originally Posted by genmaicha (Post 3922577)
I understand that the linux pipe is a buffer and that any data written to it will stay there until it is read, and if the max capacity of the buffer is reached, any additional writes will block (by default).

... <omitted> ...

What exactly is going on, what's it called, and how can I control this behavior?

There are a lot of factors involved - process switching, stdout latency, etc.
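One way to take the guesswork out of the capacity side of this is to measure it: write fixed-size blocks into a pipe whose reader never reads, and count how many complete before the writer blocks. The sketch below is a rough probe, not production code — it assumes GNU coreutils `timeout` and `dd`, and on a default Linux pipe (64 KiB since 2.6.11) it should report 64 one-KiB blocks:

```shell
#!/bin/sh
# Rough pipe-capacity probe (assumes GNU coreutils timeout and dd).
# The writer pushes 1 KiB blocks into a pipe whose reader (sleep) never
# reads, reporting each completed block on stderr.  Once the pipe is
# full the writer blocks, timeout kills it, and we keep the last count.
count=$(
  { timeout 2 sh -c '
      i=0
      while dd if=/dev/zero bs=1024 count=1 2>/dev/null
      do
        i=$((i+1))
        echo "$i" >&2        # progress goes to stderr, not into the pipe
      done' | sleep 3; } 2>&1 | tail -n 1
)
echo "1 KiB blocks written before blocking: $count"
```

The last number printed before the writer stalls is the pipe capacity in KiB; the exact value can differ on kernels with non-default pipe sizing.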

neonsignal 04-04-2010 05:51 PM

Quote:

Instead, the blocked write operation seems to wait for some random amount of time/space to free until it unblocks and writes.
I think you will find that it is not random; the threshold appears to be 4096 empty bytes before the sending process can proceed. Even if you can change this threshold (I'm not sure it is possible, I'm guessing it is the pipe block size), your system logic shouldn't depend on it.

You cannot expect this streaming to be unbuffered when you are using buffered (high level) reads and writes.
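For what it's worth, the 4096 figure matches PIPE_BUF on Linux (the largest write guaranteed to be atomic, which happens to equal the page size), and on newer kernels (2.6.35+) the pipe capacity itself can be changed per-descriptor with fcntl(F_SETPIPE_SZ) from C. You can at least inspect the relevant limits from the shell — a sketch, assuming a Linux-style /proc layout:

```shell
#!/bin/sh
# Inspect the limits that govern pipe behavior (Linux-specific paths).
# PIPE_BUF is the largest write guaranteed to be atomic (typically 4096);
# pipe-max-size is the ceiling for fcntl(F_SETPIPE_SZ) on 2.6.35+ kernels.
getconf PIPE_BUF /
cat /proc/sys/fs/pipe-max-size 2>/dev/null || echo "pipe-max-size not available on this kernel"
```

Changing the capacity still won't change the wakeup granularity, though — as said above, system logic shouldn't depend on it.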


All times are GMT -5.