LinuxQuestions.org
Old 08-19-2022, 12:07 AM   #16
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,882
Blog Entries: 1

Rep: Reputation: 1871

> If my command launches 15 processes, are 14 of them sleeping at any given time?

The number of CPU cores limits the number of processes running in parallel, as does the fact that the reader of a pipe has to wait for the writer of that pipe. (And vice versa, as the pipe has a limited capacity.)
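
The "vice versa" part can be seen with a small sketch. It assumes the pipe buffer is smaller than 1 MiB (the Linux default is 64 KiB) and that `head -c` is available; the function name `writer_blocks_demo` is made up for illustration. The writer tries to push 1 MiB into the pipe while the reader is still asleep, so it should not be able to finish until the reader starts draining the pipe.

```shell
# Writer side: push 1 MiB into the pipe, then report on stderr.
# Reader side: start late, report on stderr, then drain the pipe.
writer_blocks_demo() {
    {
        head -c 1048576 /dev/zero       # assumed to exceed the pipe capacity
        echo "writer finished" >&2
    } | {
        sleep 1                         # the reader is not reading yet
        echo "reader starts" >&2
        wc -c >/dev/null                # now the pipe gets drained
    }
}

writer_blocks_demo
```

If the assumption about the buffer size holds, "reader starts" is printed before "writer finished", because the writer sits blocked on a full pipe during the reader's sleep.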

Last edited by NevemTeve; 08-19-2022 at 12:09 AM.
 
1 members found this post helpful.
Old 08-19-2022, 12:13 AM   #17
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,057

Rep: Reputation: 7349
Quote:
Originally Posted by halfpower
My question was whether or not the processes would block the execution of the other processes. In other words: If my command launches 15 processes, are 14 of them sleeping at any given time? If they are, it is very sub-optimal.
This is the official documentation: https://docs.kernel.org/scheduler/index.html
The people who created that kernel code put a lot of work into optimizing this, so it is definitely not sub-optimal. Anyway, if you think you can do a better job, just do it (or at least explain how it can be improved).
 
1 members found this post helpful.
Old 08-19-2022, 05:58 AM   #18
GazL
LQ Veteran
 
Registered: May 2008
Posts: 6,920

Rep: Reputation: 5038
Quote:
Originally Posted by dugan
Yes. The short answer is yes, and it's been explained to you exactly how that works. How many times do you need to hear "yes" before it gets through to you?
Actually, the short answer is "no". Given a pipeline of 15 stages, 14 won't always be sleeping. The in/out data rates of the individual stages of the pipeline will determine which individual components are blocking on input or output — owing to full or empty fifo buffers — at any given time.

Assuming there is no competition for cpu resource, the overall performance of the pipeline will be dependent upon data-flow: constrained by the rate of its slowest component.
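
A rough way to try this out, as a sketch only: the `stage` and `run_pipeline` helpers below are invented for the example, and it assumes `seq` and a `sleep` that accepts fractional seconds (GNU and BSD versions do). Each stage copies lines from stdin to stdout with a per-line delay, and the middle stage is deliberately the slow one.

```shell
# stage DELAY: copy lines from stdin to stdout, pausing DELAY seconds per line.
stage() {
    while IFS= read -r line; do
        sleep "$1"
        printf '%s\n' "$line"
    done
}

# Three stages; the middle one is much slower than its neighbours.
run_pipeline() {
    seq 1 5 | stage 0.01 | stage 0.2 | stage 0.01
}

run_pipeline
```

Running `time run_pipeline` and then changing the delays should show the effect: altering the fast stages barely moves the total, while altering the slow stage changes it almost proportionally. The fast stages spend most of their time blocked on an empty input or waiting for the slow stage to accept more.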
 
Old 08-19-2022, 11:57 AM   #19
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,257

Rep: Reputation: 5338
I see that the OP tagged this thread with “asynchronous task”.

Does he understand that pipelines are not “asynchronous”?
 
Old 08-19-2022, 12:12 PM   #20
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,883
Blog Entries: 13

Rep: Reputation: 4931
You can't just take a single example and use it as a basis for a performance discussion, unless the command sequence is one you specifically wish to optimize. But when you expand the question to cite "if 15 processes ...", then that's a different, but still specific and singular question.

Recommend reviewing the link provided by pan64 about the scheduler.
 
Old 08-19-2022, 12:20 PM   #21
halfpower
Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 241

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by boughtonp
What issue?

Put another way: You don't have a performance issue until you can demonstrate a measurable issue.

Whether the answer to your question is yes or no, what difference is it going to make?

If it runs quickly enough, nobody cares what core it executes on.
Imagine if there was a terabyte of data. Would it be worthwhile to consider optimizations?

Quote:
If it doesn't run quickly enough, switching to forced parallel execution is going to make the code less maintainable, and will very likely have less of an impact than optimising whatever algorithm(s) might be involved and/or using a lower-level language for the task.
Sometimes there's a better way to do something, and I don't know about it.

Quote:
It's really not.
The prior script was IO bound. The new one should be CPU bound.

Quote:
Even if you add the missing -z argument to sed, the backslash shouldn't be escaped and the group is unnecessary, but it's far simpler to use tr to replace newlines.
The backslash is needed to escape backslashes.
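
To make that concrete, here is a small sketch. It assumes the input looks like what `jq .text` prints without `-r`, i.e. a JSON-encoded string in which a newline appears as the two characters backslash and `n`. The wrapper name `replace_escaped_newlines` is made up; the sed expression is the one from my script.

```shell
# Replace each literal backslash-n pair with a space.
# Inside the single quotes, \\ is the regex for one literal backslash.
replace_escaped_newlines() {
    sed -E 's:(\\n): :g;'
}

# printf's %s prints its argument verbatim, so sed really receives
# a backslash followed by "n" here, not a newline character.
printf '%s\n' '"first line\nsecond line"' | replace_escaped_newlines
```

This prints `"first line second line"`. With `jq -r .text` the output would contain real newline characters instead, and in that case `tr '\n' ' '` would be the simpler tool.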

Quote:
But if we pretend you did use tr, you still have to consider that jq (unlike grep/sed) will wait for stdin to complete before parsing the object, so it doesn't demonstrate any meaningful simultaneous execution.
When jq is processing data, bunzip (theoretically at least) can read more data from disk and decompress it. Likewise, when sed is processing data, jq can read from stdin and process that.

In other words, if the programs do not execute simultaneously, that would be like having an assembly line with one car. In the (newer) code example, there's clearly room for eight cars.
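
A toy version of the assembly line, as a sketch with made-up `producer`, `consumer` and `overlap_demo` functions (it assumes a `sleep` that accepts fractional seconds). The producer emits an item every 0.2 seconds, and both sides log to stderr so the ordering is visible.

```shell
# Emit three items with a pause after each, then report completion.
producer() {
    for i in 1 2 3; do
        printf 'item %s\n' "$i"
        sleep 0.2
    done
    echo "producer done" >&2
}

# Log every line as soon as it arrives on stdin.
consumer() {
    while IFS= read -r line; do
        echo "consumed $line" >&2
    done
}

overlap_demo() {
    producer | consumer
}

overlap_demo
```

The "consumed" lines show up while the producer is still running, and "producer done" comes last. Real tools such as sed usually buffer their output in larger blocks when writing to a pipe, so the hand-off there happens in chunks rather than line by line.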
 
Old 08-19-2022, 12:35 PM   #22
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,257

Rep: Reputation: 5338
Sed cannot read from stdin when jq is processing data. If you still don’t understand that, then don’t come back until you do.
 
Old 08-19-2022, 12:36 PM   #23
halfpower
Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 241

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by dugan
jq in particular needs to wait for the entire curl or bunzip command to finish before it even starts.
True. However, jq does not need to wait for sed!
 
Old 08-19-2022, 12:41 PM   #24
halfpower
Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 241

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by dugan
Sed cannot read from stdin when jq is processing data. If you still don’t understand that, then don’t come back until you do.
I think you are referring to this code:
Code:
bunzip2 really_big_file.bz2 --stdout \
| jq .text \
| sed -E 's:(\\n): :g;'
In theory, sed can read from stdin while bunzip is processing data!
 
Old 08-19-2022, 12:48 PM   #25
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,257

Rep: Reputation: 5338
What do you want to be told?

EDIT: Or, more specifically: what, exactly, do you want to get out of this thread?

Last edited by dugan; 08-19-2022 at 12:57 PM.
 
Old 08-19-2022, 12:58 PM   #26
halfpower
Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 241

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by rtmistler
You can't just take a single example and use it as a basis for a performance discussion, unless the command sequence is one you specifically wish to optimize. But when you expand the question to cite "if 15 processes ...", then that's a different, but still specific and singular question.

Recommend reviewing the link provided by pan64 about the scheduler.
The link looks potentially useful, although it's a lot to digest for a question that I had intended to be broadly generalizable about Bash programming.
 
Old 08-19-2022, 02:30 PM   #27
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,883
Blog Entries: 13

Rep: Reputation: 4931
Quote:
Originally Posted by halfpower
The link looks potentially useful, although it's a lot to digest for a question that I had intended to be broadly generalizable about Bash programming.
And yet you expanded your question beyond bash.

That's all I can help with here, sorry. The question continues to change scope and it's unclear what you're looking for.
 
Old 08-19-2022, 04:17 PM   #28
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,631

Rep: Reputation: 2557
Quote:
Originally Posted by halfpower
... a question that I had intended to be broadly generalizable about Bash programming.
The generalization is: Bash is a tool for convenience, not performance.

 
Old 08-20-2022, 03:32 AM   #29
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,057

Rep: Reputation: 7349
Yes, it is completely unclear what this is all about.
Using a pipe chain like the one mentioned (bzip | jq | cut | rev | whatever) means: all these commands will be started at the same time and will be executed independently of each other; only the output of one is sent to the next in the chain. They will not wait for the completion of any other member of the chain, but they will wait for something to work with (= input data).

And it is actually completely independent of bash: this is handled, executed, processed and driven by the kernel. bash is just one language in which you can construct pipe chains like this, but not the only one.
(Therefore the speed of execution, again, does not depend on bash at all.)
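
A quick check of the "started at the same time" part, as a minimal sketch (the function name `pipeline_elapsed` is made up). If the three commands ran one after another, the chain would need about three seconds; since they are all started together, it should need about one.

```shell
# Time a chain of three one-second sleeps, in whole seconds.
pipeline_elapsed() {
    start=$(date +%s)
    sleep 1 | sleep 1 | sleep 1
    end=$(date +%s)
    echo $((end - start))
}

pipeline_elapsed
```

The printed value is the elapsed time rounded to whole seconds, so expect it to stay below 3. None of the sleeps reads or writes anything, which is why nothing here waits for input.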
 
Old 08-20-2022, 11:53 AM   #30
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,257

Rep: Reputation: 5338
Well, this is at least part of the story. For halfpower:

Process Life Cycle: States

Last edited by dugan; 08-20-2022 at 12:41 PM.
 
  



Tags
asynchronous task, blocking, command line, concurrency, pipes


