LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   clock gettime variation (https://www.linuxquestions.org/questions/programming-9/clock-gettime-variation-869142/)

rk_linux 03-17-2011 06:30 AM

clock gettime variation
 
Hi,

I need a high-precision timer that returns accurate values up to the
99.99th percentile. I am running clock_gettime(CLOCK_MONOTONIC_RAW, ...)
in a tight loop and measuring the time difference between consecutive calls.

At the 99.8th percentile I see a time difference of about 100 ns, but there are a few values (5 out of 10000 calls) in the millisecond range. At the moment I am clueless as to what the reason behind it may be.

I am using Debian Linux with kernel 2.6.28, with kernel scheduling restricted to core 0. The timing measurements are done with the program tied to the other core. I have also disabled NTP.

Is there any possible explanation for such measurement
outliers?


Thanks !

Sergei Steshenko 03-17-2011 12:26 PM

Quote:

Originally Posted by rk_linux (Post 4293682)
Hi,

I need a high-precision timer that returns accurate values up to the
99.99th percentile. I am running clock_gettime(CLOCK_MONOTONIC_RAW, ...)
in a tight loop and measuring the time difference between consecutive calls.

At the 99.8th percentile I see a time difference of about 100 ns, but there are a few values (5 out of 10000 calls) in the millisecond range. At the moment I am clueless as to what the reason behind it may be.

I am using Debian Linux with kernel 2.6.28, with kernel scheduling restricted to core 0. The timing measurements are done with the program tied to the other core. I have also disabled NTP.

Is there any possible explanation for such measurement
outliers?


Thanks !

What assumptions do you have WRT time measurements and program execution time? I.e., do you expect the execution time of the same program/piece of code to be the same from call to call? What do you know about how computer HW works? Do you understand how the OS works WRT task switching?

rk_linux 03-17-2011 12:54 PM

In this context, what I am measuring is the overhead of the clock_gettime() syscall. I am pinning
the measurement program to a given core. I have restricted kernel scheduling to just one core (not the same as the measurement
program's core). By this I am making sure that my task is not rescheduled.

I think I should pose my question differently: what is the possible explanation for this variation, and what could be
done to minimize it?

A variation from nanoseconds to milliseconds will have a huge impact if you are taking real-time measurements.

Sergei Steshenko 03-17-2011 03:14 PM

Quote:

Originally Posted by rk_linux (Post 4294165)
In this context, what I am measuring is the overhead of the clock_gettime() syscall. I am pinning
the measurement program to a given core. I have restricted kernel scheduling to just one core (not the same as the measurement
program's core). By this I am making sure that my task is not rescheduled.

I think I should pose my question differently: what is the possible explanation for this variation, and what could be
done to minimize it?

A variation from nanoseconds to milliseconds will have a huge impact if you are taking real-time measurements.

Regarding
Quote:

I am using Debian Linux with kernel 2.6.28, with kernel scheduling restricted to core 0. The timing measurements are done with the program tied to the other core.
- does this mean that the program doing the measurements is never suspended/resumed by the kernel?

My understanding of the quote is that the kernel is pinned to one core and your program to the other, but IMO this doesn't mean your program cannot be suspended/resumed. Suppose your program needs IO, and the IO device is busy - your program will be suspended by the kernel until the IO is available. Also, I think that pinning a program to a core does not prevent other programs from using the same core. I.e., I think that pinning a program to a core reduces the task-switching overhead of the pinned program, but does not prevent other programs from using the same core.

Now, the function you've chosen, according to my understanding of its manpage, gives you wall-clock duration (and not CPU-cycle duration). So any task suspension/resumption can cause such variations, or even bigger ones.

rk_linux 03-17-2011 03:43 PM

Quote:

Originally Posted by Sergei Steshenko (Post 4294292)
- does this mean that the program doing the measurements is never suspended/resumed by the kernel?

Sorry for not being clear earlier: I have excluded all but one of the CPU cores from the default scheduler, and ensured that the measurement code runs on the other core. That excludes the possibility of that core being used by another
program, or of the program being suspended/resumed.

Sergei Steshenko 03-17-2011 08:42 PM

Quote:

Originally Posted by rk_linux (Post 4294320)
...That excludes the possibility of that core being used by other
program or the program being suspended/resumed.

So, again, suppose your supposedly isolated program executes a, say, 'printf' statement, which ultimately translates into a system call. Since you have a system call, how can you prevent the system (kernel) from suspending/resuming your task?

I am writing my conclusions based on my general understanding of how things work; my understanding can be wrong. If you think your understanding of why/how you can achieve the state in which your task is neither suspended nor resumed is correct, could you point me to some documentation stating that?

