LinuxQuestions.org
Review your favorite Linux distribution.
Home Forums Tutorials Articles Register
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices


Reply
  Search this Thread
Old 01-29-2005, 08:23 PM   #1
rovitotv
Member
 
Registered: Jan 2003
Location: Dayton, Ohio
Distribution: Slackware 10.2
Posts: 116

Rep: Reputation: 15
Benchmarking Small Instructions


I understand how to benchmark with the time function an algorithm but what if the algorithm is very fast? Faster than the resolution of the clock. For example suppose I want to benchmark the following insturction

int x
int y
int z

x = y + z;

I can't use the standard time function call in C because the resolution of the clock is not fine enough. If I have a 2 GHZ CPU this instruction should take around 4 clock ticks ~ (4 x .5ns = 2 ns). I figure with a 2 GHZ CPU a single instruction should take 1/2,000,000,000 ns = .5 ns . These are just back of the envelope calculations. The logical solution would be to run the instruction in a loop and calculate the run time as LOOP_WITH_INST. I would then run the same program loop but no instruction and call that run time LOOP. If I subtract LOOP_WITH_INST - LOOP I will get just the run time of the instruction, call it RUN_TIME executed X amount of times. Using the following equation I can calculate the number of ns that single instruction takes.
(RUN_TIME/X) * 1,000,000,000 = number of NS for that single instruction.

When I try this on my AMD 2400+ running Slackware 10 I don't get anywhere near the 2 ns, my final answer is 0.0125 ns. I know I can expect some performance improvement due to cache but I didn't expect this much. Any ideas? The code I used is listed below:

My calculations
LOOP = 234
LOOP_WITH_INST = 235
RUN_TIME = 1

(1/80,000,000,000) * 1,000,000,000 = 0.0125 ns




//loop.c ================================================
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>

int main()
{
int x = 1;
int y = 1;
int z = 1;
int oc;
long int count;
time_t start;
time_t end;
double time_secs;

start = time(NULL);
for (count = 0; count < 2000000000; oc++)
{
for(oc = 0; oc < 40; oc++)
{
}
}
end = time(NULL);
time_secs = difftime(end, start);
printf("total execution time in seconds:%f minutes:%f \n",
time_secs, (time_secs/(double)60));

return 0;
}

loop_with_inst.c ===========================================
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>

int main()
{
int x = 1;
int y = 1;
int z = 1;
int oc;
long int count;
time_t start;
time_t end;
double time_secs;

start = time(NULL);
for (count = 0; count < 2000000000; oc++)
{
for(oc = 0; oc < 40; oc++)
{
x = y + z;
}
}
end = time(NULL);
time_secs = difftime(end, start);
printf("total execution time in seconds:%f minutes:%f \n",
time_secs, (time_secs/(double)60));

return 0;
}
 
Old 01-29-2005, 10:35 PM   #2
jtshaw
Senior Member
 
Registered: Nov 2000
Location: Seattle, WA USA
Distribution: Ubuntu @ Home, RHEL @ Work
Posts: 3,892
Blog Entries: 1

Rep: Reputation: 67
I'm not sure why you think an add will take 4 clock ticks. In fact, if you do enough back to back adds it should average out to more like 1 clock tick because of the pipelining going on in the processor.

Also... make sure you do a gcc -S on that to see what the assembly output is... make sure it isn't optimizing all those instructions out.
 
Old 01-30-2005, 07:30 AM   #3
rovitotv
Member
 
Registered: Jan 2003
Location: Dayton, Ohio
Distribution: Slackware 10.2
Posts: 116

Original Poster
Rep: Reputation: 15
I assume the following:

x = y + z;

takes four clock ticks because
1 clock tick to load y
1 clock tick to load z
1 clock tick to add
1 clock tick to put result in x

I could be wrong it has been awhile since I took a computer arch class. Even if the entire operation takes a single clock tick the resulting calculation should be .5 ns not .0125 ns. The operation is to fast that is why I think I have something wrong. Thanks for the help.
 
Old 01-30-2005, 08:15 AM   #4
gr33ndata
Member
 
Registered: Aug 2003
Location: DMZ
Distribution: Ubuntu
Posts: 144

Rep: Reputation: 15
I think you hav e to convert that code into assembly first, then see
 
Old 01-30-2005, 11:17 AM   #5
jtshaw
Senior Member
 
Registered: Nov 2000
Location: Seattle, WA USA
Distribution: Ubuntu @ Home, RHEL @ Work
Posts: 3,892
Blog Entries: 1

Rep: Reputation: 67
Code:
1 clock tick to load y
1 clock tick to load z
1 clock tick to add
1 clock tick to put result in x
Well, your partially right. The loads both happen at the same time. However, from an architecture standpoint, it still only takes 1 tick to do the add. Because there isn't just one instruction in the pipeline at the same time. There's one at fetch, one at decode, and so on. So over the course of a million instructions, providing you can keep the pipeline full (and you don't get pre-empted), you should see an average of about 1 add per clock cycle.

I had to do some benchmarking on a XScale last fall and the way we did it was using the high resolution timer and we did all the measurements from without a kernel module while holding all the locks. This ensured that nobody accept the interrupt handlers could kick us of the processor.

Also, compile your going into assembly and make sure the compiler isn't doing any voodoo magic:-P. (the -S option on gcc spits out the assembly). Whenever I do benchmarks I specify -O0 and make all the important vars volatile.

Last edited by jtshaw; 01-30-2005 at 11:20 AM.
 
  


Reply



Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
Benchmarking goldfish_memory Linux - General 1 11-18-2005 12:45 PM
Benchmarking an application JAllegraud Programming 2 11-04-2004 09:24 AM
benchmarking my code spuzzzzzzz Programming 2 09-16-2004 06:01 AM
benchmarking linux new_linux_goof Linux - General 0 05-14-2002 08:59 PM
Stress benchmarking linear Linux - General 0 01-13-2002 01:44 AM

LinuxQuestions.org > Forums > Non-*NIX Forums > Programming

All times are GMT -5. The time now is 12:12 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration