A problem with benchmarking a computer system is that almost no matter what you choose to do, the benchmark will be unrealistic in some way. Back in the day, when folks were running 8080 and Z-80 processors in "small" machines (with CP/M, Control Program for Microcomputers, or similar operating systems, see
http://en.wikipedia.org/wiki/CP/M), perhaps 32 to 64 kilobytes of RAM, and floppy disks (or maybe 10- to 50-megabyte hard disk drives), a favorite algorithm for testing the speed of the system was the Sieve of Eratosthenes, an algorithm for finding prime numbers up to a given limit.
You would load up a
sieve program written in assembly language or BASIC or, as time went on, C, Pascal or something else and execute it: the more sieve passes you completed in a given time, the faster the computer system. The problem was that you were only running one process (no multitasking there), so the only thing you could conclude was that machine X was faster than machine Y (or A, B, C, Z, whatever). Didn't really tell you much of anything about how anything else would perform (see
http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes for a fuller discussion): we're talking the '70s and '80s here.
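Just to illustrate the idea, here's a rough C sketch of the kind of sieve loop those old benchmarks ran: repeat a fixed-size sieve some number of times, time it, and compare machines on passes per second. The 8190-flag array size mimics the sort of size those old published sieve benchmarks used, but the details here are my own sketch, not a copy of any particular benchmark.

Code:
/*
 * Rough sketch of the old "sieve" style benchmark: repeatedly run a
 * Sieve of Eratosthenes over a small fixed-size array and report how
 * many passes complete per second.  Array size and pass count are
 * arbitrary choices, not a faithful copy of any published benchmark.
 */
#include <stdio.h>
#include <string.h>
#include <time.h>

#define SIZE   8190      /* flags for odd numbers >= 3; a few KB of RAM */
#define PASSES 10000     /* bump this up on a fast machine */

int main(void)
{
    static char flags[SIZE + 1];
    long primes = 0;
    clock_t start = clock();

    for (int pass = 0; pass < PASSES; pass++) {
        primes = 0;
        memset(flags, 1, sizeof flags);      /* assume everything is prime */
        for (long i = 0; i <= SIZE; i++) {
            if (flags[i]) {
                long prime = i + i + 3;      /* index i represents 2i + 3 */
                for (long k = i + prime; k <= SIZE; k += prime)
                    flags[k] = 0;            /* knock out odd multiples */
                primes++;
            }
        }
    }

    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("%d passes, %ld primes per pass, %.2f seconds (%.1f passes/sec)\n",
           PASSES, primes, secs, PASSES / secs);
    return 0;
}

Of course, all this measures is one CPU-bound process running flat out with nothing else going on -- which is exactly the problem described below.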
As time went on machines got better and faster, and "real" operating systems became available on microcomputers; by "real" we mean multiuser, multitasking operating systems (Unix and others), and benchmarks became all the rage. Vendors were selling computer systems into given markets based upon how a given benchmark performed on their hardware and software: you're doing engineering, here's how our box performs running benchmark X. Sold a lot of computer systems.
However, the problem of benchmarking remained: executing an isolated benchmark program with no other activity on the system does not produce truly meaningful results; a hint, maybe, but not what's going to happen when you have 30 or more users doing whatever they're doing while this particular application executes.
I spent a lot of my time as a software engineer optimizing code (usually in small chunks). I worked in FORTRAN and C, later exclusively in C and SQL on multiterabyte databases (plural: something like 20 databases, all pretty much the same size, being accessed simultaneously by 20-30 users). Database design is more art than science, and query optimization can be a frustrating process; the only tests are: did I get the correct result, and did this query run faster than the old one? Nope? Well, give 'er another shot until you get to the "best" balance of resources and speed.
When we're running a single-user system (which, I suspect, most of us are most of the time), we sometimes experience clunky performance from a given application. Might be memory limits, might be processor limits, might be disk I/O limits, who knows; could be just a monster program designed and developed by not-so-hot programmers (including ourselves: been there, done that, paid the price doing it over). If you're running finite element analysis on the Golden Gate Bridge, it's going to take a while and other things are going to slow down, eh?
The Linux system is pretty darned efficient -- lots of daemons sleeping, waking up every so often, doing something, then going back to sleep. Ever better kernel, ever better (usually) applications, ever better compilers, ever better hardware.
As suggested by @genss above, Hierarchical INTegration (HINT) might be an answer to the benchmarking conundrum:
Quote:
Hierarchical INTegration, or HINT for short, is a computer benchmark that ranks a computer system as a whole (i.e. the entire computer instead of individual components). It measures the full range of performance, mostly based on the amount of work a computer can perform over time.
I suspect that HINT is a step in the right direction (and I'm going to download it and give it a try when I've got some time, after reading the links in the Wikipedia article given by @genss). Might just be more meaningful.
Hope this helps some.