LinuxQuestions.org
Old 12-07-2010, 01:12 PM   #1
hillgreen
LQ Newbie
 
Registered: Dec 2010
Posts: 4

why: the same program runs multiple times but gets very different results


I used wait4() to time a child process, but the results differ from each other dramatically. Why?
My core source code is listed below.
To the point: on a Linux 2.6 kernel, how do I time a process with high precision, and only its user time rather than the elapsed time?

#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

static struct rusage ruse;
static pid_t u_pid;

int main(int argc, char *argv[])
{
    int i;

    for (i = 0; i < 10; i++) {
        u_pid = fork();

        if (u_pid == 0) {
            execl("./anti", "anti", (char *)NULL);
            _exit(127);         /* only reached if execl() fails */
        } else if (u_pid > 0) {
            int sec, usec;
            float usetime;
            pid_t repid = wait4(u_pid, NULL, WUNTRACED, &ruse);

            if (repid < 0)
                return 1;
            usec = ruse.ru_utime.tv_usec;
            sec = ruse.ru_utime.tv_sec;
            usetime = sec + usec / (float)1000000;
            printf("%g\n", usetime);
        }
    }
    return 0;
}

and my result:
1.08807
1.06407
1.08807
1.05607
1.06807
1.07607
1.06807
1.07607
1.07207
1.08407
 
Old 12-08-2010, 06:31 AM   #2
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Quote:
Originally Posted by hillgreen View Post
I used wait4() to time a child process, but the results differ from each other dramatically. Why?
My core source code is listed below.
To the point: on a Linux 2.6 kernel, how do I time a process with high precision, and only its user time rather than the elapsed time?

[code and results snipped]
Why do you call the differences dramatic, i.e. which differences would you consider to be non-dramatic? Justify your calculations of non-dramatic differences and your expectation WRT the results in general.
 
Old 12-09-2010, 06:06 AM   #3
hillgreen
LQ Newbie
 
Registered: Dec 2010
Posts: 4

Original Poster
The differences are about 10 ms; the precision I need is 0.5 ms. In my opinion, the time spent in user space can be obtained from rusage, but that is not working now. Why not? And how can I measure the time a program spends executing in system or user space with 0.5 ms precision?

Last edited by hillgreen; 12-09-2010 at 06:07 AM. Reason: more info
 
Old 12-09-2010, 06:09 AM   #4
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Quote:
Originally Posted by hillgreen View Post
The differences are about 10 ms; the precision I need is 0.5 ms. In my opinion, the time spent in user space can be obtained from rusage, but that is not working now. Why not? And how can I measure the time a program spends executing in system or user space with 0.5 ms precision?
Do you understand how your computer, including the CPU, works? Do you know which big parts the system consists of? Have you ever heard the words "cache" and "DRAM"?
 
Old 12-09-2010, 10:44 AM   #5
salasi
Senior Member
 
Registered: Jul 2007
Location: Directly above centre of the earth, UK
Distribution: SuSE, plus some hopping
Posts: 3,919

Quote:
Originally Posted by hillgreen View Post
I used wait4() to time a child process, but the results differ from each other dramatically. Why?
They are all 1.0-something or other. I agree with Sergei that this is not dramatic. It may not meet your requirements, but that is a rather different issue.

Quote:
1.08807
1.06407
1.08807
1.05607
1.06807
1.07607
1.06807
1.07607
1.07207
1.08407
Did you actually look at your data? All of your data points (from this run) end in ...07; in fact, in 207, 407, 607 and 807. Those will be completely random numbers, of course.

Looking at your pattern, you could come to some conclusion about what could and could not be captured as a result of these runs.
 
Old 12-09-2010, 11:20 AM   #6
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

To the OP: you apparently have an expectation of constant execution time, and this expectation is wrong - regardless of timer accuracy. That's why I wrote:

Quote:
Justify your calculations of non-dramatic differences and your expectation WRT the results in general.
 
Old 12-09-2010, 01:32 PM   #7
jiml8
Senior Member
 
Registered: Sep 2003
Posts: 3,171

The execution time probably is constant, but what the OP measured in the program was the total wall-clock time to execute it, which of course includes context switches and other things that the system does during that time.

What the OP apparently wants is to determine the total CPU time taken by the specific child process.

Take a look at getrusage(). Also look at /proc/stat. The answers you want are in those areas.
 
Old 12-09-2010, 02:29 PM   #8
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Quote:
Originally Posted by jiml8 View Post
The execution time probably is constant ...
No, it is not: by construction of modern HW in the first place, by context switches, and by other HW processes.

Last edited by Sergei Steshenko; 12-09-2010 at 02:31 PM.
 
Old 12-09-2010, 02:56 PM   #9
jiml8
Senior Member
 
Registered: Sep 2003
Posts: 3,171

Quote:
Originally Posted by Sergei Steshenko View Post
No, it is not. By construction of modern HW in the first place, by context switches, by other HW processes.
Please try to pay attention, and read carefully and correctly.

The execution time probably is constant. Every time the program "anti" is run...whatever anti is...it probably runs in the same number of CPU cycles and bus cycles. This is almost certainly true. Period.

Now, if you had taken the time to pay attention and actually read the rest of the sentence I wrote - which you clearly did not because you only clipped the first portion of it, suggesting that your attention span is only that long - you would have seen that I stated that OP was actually getting the wall clock time which included things like context switches.

I've seen you do this too many times. And on this particular thread, you have done it again. And following your usual pattern you took a patronizing tone with OP while NOT presenting a solution. You will note that I did tell the OP where to look for a solution.

Remember one thing, Junior. No matter how much you think you know, there's bound to be someone around who knows more than you do. And in this case, on this topic, I'm that someone.

Last edited by jiml8; 12-09-2010 at 02:58 PM.
 
Old 12-09-2010, 03:03 PM   #10
paulsm4
Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Sergei is absolutely correct - timings WON'T be accurate to the microsecond. Or probably not even to the millisecond.

@hillgreen -

Here are a couple of other quick/easy ways to get "timings". It might be interesting to compare the results. And - equally interesting - to compare the variance between successive runs.

1. Use "time" in a shell script
Code:
time anti
time anti
time anti
time anti
time anti
time anti
time anti
2. Recompile your "anti" program with "profiling enabled" and run gprof:
http://www.cs.utah.edu/dept/old/texinfo/as/gprof.html

3. Remember -
a) Linux is NOT a "real-time operating system" (and, of course, neither is Windows)
b) The whole point of "real time" is NOT "instantaneous" or "as fast as possible"...
... rather, the point is "deterministic"
c) And (at the risk of duplicate redundancy) - Linux is NOT "deterministic"

'Hope that helps

PS:
If you don't have a scientific calculator handy, here's a cute web app for computing standard deviation:
http://www.easycalculation.com/stati...-deviation.php

Last edited by paulsm4; 12-09-2010 at 03:06 PM.
 
Old 12-09-2010, 03:10 PM   #11
jiml8
Senior Member
 
Registered: Sep 2003
Posts: 3,171

Quote:
Originally Posted by paulsm4 View Post
Sergei is absolutely correct - timings WON'T be accurate to the microsecond. Or probably not even to the millisecond.
CPU time will be. Wall clock time won't be. The time spent executing is CPU time. The time from start to finish is wall clock time.

Quote:
@hillgreen -

Here are a couple of other quick/easy ways to get "timings". It might be interesting to compare the results. And - equally interesting - to compare the variance between successive runs.

1. Use "time" in a shell script
Code:
time anti
time anti
time anti
time anti
time anti
time anti
time anti
That won't be as good as what he already has.

Quote:
2. Recompile your "anti" program with "profiling enabled" and run gprof:
http://www.cs.utah.edu/dept/old/texinfo/as/gprof.html
That will do a lot better than what he has, but so will getrusage() or analyzing /proc/stat.
 
Old 12-09-2010, 03:23 PM   #12
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Quote:
Originally Posted by jiml8 View Post
Please try to pay attention, and read carefully and correctly.

The execution time probably is constant. ...
No, it is not. And I suggested answering specific questions in: http://www.linuxquestions.org/questi...6/#post4185742

You apparently haven't answered these questions. So, here are my answers.

Modern computers typically use DRAM. "D" stands for "dynamic", meaning capacitive storage elements. Capacitive storage elements need refresh cycles. Refresh cycles are implemented as completely independent HW process, i.e. there is a piece of HW doing this. While a refresh cycle is in progress, the DRAM is inaccessible. I.e. a read or write operation will take more time if it hits a refresh cycle.

Modern CPUs have caches, and a CPU's job is to execute instructions and, while doing this, to load and store data. Instruction fetches and data load/store operations can occur with/from the cache, in which case they are fast, or with/from RAM, in which case they are (IIRC) ten(s) of times slower. Whenever the OS switches tasks, the cache is essentially repopulated, so task switching does cause variations in execution time.

Modern (and not only modern) computers use DMA extensively. DMA is yet another HW process. If a DMA transfer is in progress, RAM (at least the bank affected by the DMA) is inaccessible to the CPU, so it looks to the CPU like slower RAM.

As I have already written many times, taking all this into consideration, execution time is not constant. Because it simply can't be.
...
P.S. I took part in development of DMA block of a pretty complex communications chip; the chip also contained a CPU. It was my day to day job to perform HDL simulations of the chip carefully watching bus activity. DMA typically has higher than CPU bus priority - a network packet can't wait, that's why, for example.
 
Old 12-09-2010, 04:33 PM   #13
jiml8
Senior Member
 
Registered: Sep 2003
Posts: 3,171

Quote:
Originally Posted by Sergei Steshenko View Post
No, it is not. And I suggested answering specific questions in: http://www.linuxquestions.org/questi...6/#post4185742

[rest of the quoted post snipped]
Do you know what? That's all true. I won't argue with any of it. But I will put it into context.

If a process has to wait on a memory refresh (which could happen) the process will be marked "not ready", the processor will do a context switch, and therefore the process is not executing. So the time charged against it is wall clock time, not processor time.

As for processor caching, the devil is in the details. I will agree that if process B is waiting and process A winds up using all the onboard cache, then when process B gets the processor back it will have to wait for fetches from RAM. And this will cause some variance in the CPU time associated with the execution, which will be charged against the process since the wait states associated with waiting on RAM are charged against the process. However, even in worst case, the time variance associated with doing this will be a very very small fraction of the total time variance associated with measuring wall clock time, which includes context switches and waits for I/O and so forth.

Similarly, extensive DMA can affect RAM access times, but often enough that will cause a CPU context switch if the data or instructions are not already cached, and again the time is not charged against the process.

So. What you say is true. But it is also a significant impact only at a scale that is ordinarily well below the scale at which the OP is working, and is at best an extremely trivial contributor to the variance OP was asking about, presuming the OP's computer is reasonably modern. How trivial? Very hard to say. But if I had to take a whack at it, I'd place it someplace on the order of a thousandth of a percent or less of OP's indicated variance in a modern PC-class computer. That's just a WAG, and if you can present real numbers, please do so.

Last edited by jiml8; 12-09-2010 at 04:36 PM.
 
Old 12-09-2010, 04:41 PM   #14
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Quote:
Originally Posted by jiml8 View Post
If a process has to wait on a memory refresh (which could happen) the process will be marked "not ready", the processor will do a context switch, and therefore the process is not executing. So the time charged against it is wall clock time, not processor time.
...

Nonsense. The CPU has no notion of memory refresh. The CPU can't do anything externally while in wait cycles; for example, it can't push the current task's registers onto the stack, and it can't process an interrupt.

To the same extent the CPU has no knowledge about DMA.

...

Spend about $100 or something like that (well, more - you'll need an oscilloscope too), buy a cheap development board for a simple controller - preferably without pipeline, and do something on bare metal from scratch. I.e. write your own BIOS first.

For me it was quite revealing in the late eighties to develop an in-circuit emulator and to use it to debug HW and SW.

Last edited by Sergei Steshenko; 12-09-2010 at 04:45 PM.
 
Old 12-09-2010, 05:20 PM   #15
paulsm4
Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Hi -

I agree:
Quote:
As I have already written many times, taking all this into consideration, execution time is not constant. Because it simply can't be.
Quote:
Spend about $100 or something like that (well, more - you'll need an oscilloscope too), buy a cheap development board for a simple controller - preferably without pipeline, and do something on bare metal from scratch. I.e. write your own BIOS first.
Or, even better:

* Take your favorite (CPU- and memory- intensive) program, and time it any three or four or five ways you like.

* Run it ten or 100 or 1000 times for each timing method you choose.

* Compute the variation for each method.

I'll bet your timings might be a lot closer than hillgreen's (with his fork()s and wait()s, which introduce a HUGE amount of latency). But I seriously doubt they'll consistently line up to the nearest millisecond, either.
 
  

