LinuxQuestions.org


merijnv 04-10-2009 03:48 AM

Single cpu-core used on multicore system when processes communicate with pipe
 
For scientific experiments I am a user on an interesting computer. It has an enormous amount of RAM (128G) and 16 cores (4 quad-core AMD64 processors). uname -a says:
Linux cn51 2.6.27.19-78.2.30.fc9.x86_64 #1 SMP Tue Feb 24 19:44:45 EST 2009 x86_64 x86_64 x86_64 GNU/Linux


If I start three jobs that are CPU-bound but communicate with each other through a pipe, I see only one CPU in use; the user percentages add up to exactly 100% (one core). I must admit that an enormous amount of data flows through these pipes: I process a megabyte a second or so.

In top, using the one-line-per-processor view, I see that the processes are always on the same CPU but on different cores. The user times add up to almost exactly 100%, no significant sys time is measured, and idle + user is also close to 100% per core.

Is there a kernel-level parameter that can be tweaked to ask the scheduler for another way of scheduling these?

Or, maybe even better, something in the environment I can set as user?

syg00 04-10-2009 06:11 AM

Quote:

Originally Posted by merijnv (Post 3504396)
Is there a kernel-level parameter that can be tweaked to ask the scheduler for another way of scheduling these?

Or, maybe even better, something in the environment I can set as user?

Nope.
If you are passing data via a pipe, you have a (effectively) synchronous write/read arrangement.
Hence you will never see more than what appears to be one processor in use.
The scheduler will attempt to enforce (hardware) locality to take advantage of cache coherency. Hence you see dispatch on just the one substrate.

If you can, redesign the work for better multi-tasking.

johnsfine 04-10-2009 07:47 AM

Quote:

Originally Posted by merijnv (Post 3504396)
Is there a kernel-level parameter that can be tweaked to ask the scheduler for another way of scheduling these?

Or, maybe even better, something in the environment I can set as user?

Quote:

Originally Posted by syg00 (Post 3504463)
Nope.

That "Nope" is certainly correct.

Quote:

If you are passing data via a pipe, you have a (effectively) synchronous write/read arrangement.

But that is significantly overstating the problem.

You should be able to pass data via a pipe and still have overlapping processing.

Something is wrong with either your overall design, or the details of the way you write to each pipe.

You cannot get data out of a pipe before the other side puts the data in. So assuming the processing on the read side of the pipe requires the data, it must wait for it.

But when you write to a pipe, waiting for the other side to read the result is optional (I wish I understood the exact details myself. Some of the documentation is unclear, especially since I'm trying to use mostly the same source code between Windows and Linux. But I don't expect your situation is that complex).

Your write side almost certainly should not be waiting for the other side to read. It should be able to generate the next data as soon as it throws the last data into the pipe.
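
To illustrate the point about the write side, here is a minimal Python sketch (not the original poster's code; the 4096-byte payload is an arbitrary assumption chosen to stay well under the usual Linux pipe buffer size). It shows that a write to a pipe returns once the kernel has buffered the data, before any reader has touched it. A writer only blocks once that buffer fills up.

```python
import os

# Create an anonymous pipe; nothing is reading from it yet.
read_fd, write_fd = os.pipe()

# Assumed payload size: well below the typical Linux pipe capacity (64 KiB).
payload = b"x" * 4096

# The write returns as soon as the kernel has buffered the bytes,
# even though no reader has consumed anything so far.
written = os.write(write_fd, payload)

# The writer could go on computing its next result here.
# Closing the write end lets the reader see end-of-file later.
os.close(write_fd)

# Only now does the "other side" read, and it gets the buffered bytes.
received = b""
while True:
    chunk = os.read(read_fd, 65536)
    if not chunk:  # empty bytes means end-of-file
        break
    received += chunk
os.close(read_fd)

print(written, len(received))
```

If the writer in the real jobs flushes tiny pieces and then waits for a reply before continuing, the buffering shown here cannot help, and the processes will take turns instead of overlapping.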

Quote:

If you can, redesign the work for better multi-tasking.

Of course. The first step is understanding why two out of three processes are waiting at any given moment.

If you explain a bit more about the data flow between the three processes, we might have some better ideas on how to identify and correct the undesired waits.
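
One rough way to find such waits is to time the blocking reads in each stage. The sketch below is only an illustration under assumptions: `timed_read` and `slow_writer` are hypothetical helper names, a thread stands in for the upstream process, and the 0.2 second delay is made up to simulate a producer that is busy computing.

```python
import os
import threading
import time


def timed_read(fd, nbytes):
    """Read from fd and report how long the call blocked waiting for data."""
    start = time.monotonic()
    data = os.read(fd, nbytes)
    return data, time.monotonic() - start


def slow_writer(fd, delay):
    """Hypothetical upstream stage: 'computes' for delay seconds, then writes."""
    time.sleep(delay)
    os.write(fd, b"result")
    os.close(fd)


read_fd, write_fd = os.pipe()

# A thread stands in for the upstream process in this sketch.
producer = threading.Thread(target=slow_writer, args=(write_fd, 0.2))
producer.start()

# The reader blocks until the producer has put something into the pipe.
data, waited = timed_read(read_fd, 1024)

producer.join()
os.close(read_fd)

print(f"read {len(data)} bytes after waiting {waited:.3f} s")
```

If a stage spends most of its wall-clock time inside reads like this, it is starved by the stage before it; the same kind of timing around writes would show a stage that is held up by a slow consumer.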

syg00 04-10-2009 09:17 PM

Quote:

Originally Posted by johnsfine (Post 3504529)
But that is significantly overstating the problem.

I was trying to describe what I perceived the issue to be, not pipes per se.
Maybe I should have said something like
"If you are passing data via a pipe, it appears you have a (effectively) synchronous write/read arrangement."


All times are GMT -5.