System.time?

alaios · 03-29-2011, 05:13 AM

Dear all.
In R, I am using the system.time function to measure the execution time of two functions:

> system.time(lapply(seq(1:1000000),returni))
   user  system elapsed
  0.982   0.100   1.057

> system.time(mclapply(seq(1:1000000),returni))
   user  system elapsed
  0.827   0.607   1.502

Since system.time uses the Unix time functions, I would like your help understanding the output (user, system, elapsed) so that I can figure out which of the two functions is faster.

Regards
Alex

16pide · 03-29-2011, 11:09 AM

My understanding is this:
user is the CPU time the processor spent running your application's own code
system is the CPU time the kernel spent servicing system calls made by your program
elapsed is the wall-clock time between program start and program end

User time can be smaller than elapsed time because other programs are running on the system at the same time.
Also, if the program asks for data from disk, for example, then while that data is being loaded into memory the processor switches to other programs that can use those cycles.
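
To see the three numbers move independently, here is a minimal sketch, assuming bash (it relies on the bash `time` keyword and the `TIMEFORMAT` variable). The function names `busy` and `idle` are made up for the illustration. The busy loop should show mostly user time, while the sleep should show user and system near zero with elapsed around one second.

```shell
#!/usr/bin/env bash
# Print only the three numbers, in the same order as R's system.time().
TIMEFORMAT='user=%U system=%S elapsed=%R'

# Hypothetical CPU-bound job: a busy loop run by the shell itself,
# so nearly all of its time should be counted as "user".
busy() {
    local i=0
    while [ "$i" -lt 200000 ]; do
        i=$((i + 1))
    done
}

# Hypothetical waiting job: the process is idle while it sleeps,
# so user and system stay near zero and only elapsed grows.
idle() {
    sleep 1
}

# 'time' is a bash keyword and reports on stderr; capture it with 2>&1.
busy_report=$( { time busy; } 2>&1 )
idle_report=$( { time idle; } 2>&1 )

echo "busy: $busy_report"
echo "idle: $idle_report"
```

The exact figures depend on the machine and on whatever else is running, so treat the output as an illustration rather than a benchmark.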

You need to read up on schedulers, multitasking, system calls, etc.

Anyway, in your case above, the faster call is lapply: its user+system time is lower than mclapply's (1.082 s versus 1.434 s), so it uses less processor time (and potentially less electricity).
More importantly for you, its elapsed time is also lower (1.057 s versus 1.502 s), so if you use it in a script, the script can move on to the next command almost half a second sooner. That really matters if you run the command thousands of times in a loop.

alaios · 04-11-2011, 04:30 AM

If I got it right, I should mostly be interested in elapsed time. I want a script/program to run as fast as possible. I do not care about power consumption or how much I stress my CPU.

Do you agree with that?

regards
Alex

16pide · 04-12-2011, 03:13 AM

Probably, yes. Unless you want to predict how it will behave when the system is heavily loaded with other programs, or when you run the command several times in parallel.
Multiple processes cannot use the same processor at exactly the same time, but they can all be waiting at the same time, especially if they are not waiting for the same resource: for example, one process waits for graphics I/O while another waits for disk I/O.

Suppose you're on a single-CPU machine, and your process spends 100% of its time in user mode (no system time, no waiting) and lasts one minute.
Run two copies in parallel and they will take 2 minutes to finish.

Now suppose your process spends 1% of its time in user mode, none in system mode, and 99% waiting on the clock, and it lasts one minute.
Run it a hundred times in parallel and it will still take about 1 minute. The command would be:
sleep 60

try it with:
time sleep 60

That's a bit extreme, but I think it explains the concept.
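
If you don't want to wait a full minute, here is a scaled-down sketch of the same idea. The helper name `run_parallel_sleeps` and the numbers (10 copies, 2 seconds each) are just assumptions for the illustration. Because every copy is waiting rather than computing, the total elapsed time should stay close to 2 seconds instead of growing to 20.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start N copies of "sleep SECS" in the background,
# wait for all of them, and print the elapsed wall-clock time in seconds.
run_parallel_sleeps() {
    local n=$1 secs=$2 start end i
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        sleep "$secs" &       # each copy only waits on the clock
        i=$((i + 1))
    done
    wait                      # block until every background sleep has finished
    end=$(date +%s)
    echo $((end - start))
}

elapsed=$(run_parallel_sleeps 10 2)
echo "10 parallel sleeps of 2 s each took about ${elapsed} s"
```

Replace the sleep with a CPU-bound command and the picture changes: on a single CPU the copies have to take turns, so the elapsed time grows with the number of copies.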

Have fun!

Walter.Stroebel · 04-12-2011, 03:34 AM

Elapsed time also includes any disk or other I/O, which means you might get unpredictable results.
E.g. "time wget some_url" will depend almost entirely on your Internet connection and the server at the other end.
If you are disk-bound (waiting on the disk), you might want to

apt-get install sysstat
or
yum install sysstat

and keep

iostat -k 10

running in another window/console to see the disk I/O you are causing.

Regards,
Walter