Hi. I'm a Unix Administrator, mathematics enthusiast, and amateur philosopher. This is where I rant about that which upsets me, laugh about that which amuses me, and jabber about that which holds my interest most: Unix.

# Fun with prime numbers

When I was in college I wrote a silly benchmark to test different programming languages. The task was to brute-force calculate the first 150,000 prime numbers. I wrote the program in C, C++, Java, Perl, and Python. I fudged an optimization on them all (the optimization: only run division calculations up to sqrt(n) for a given prime candidate n...since sqrt is costly to calculate, I used n/3 (a horrid estimate, I know...but it still cuts out quite a few tests)). Recently I got a chance to discuss the primes algorithm with a programmer that I didn't know. He pointed out the obvious fault in using n/3 and suggested sqrt(n). While explaining the concept and why I used n/3, another idea hit me...why not keep a rolling array of the squares of the primes I've found so far, and use them to estimate? This way, I don't have to calculate sqrt(n) at all (sq(n-1) is larger than n, so I'd have to use some value sq(n-x), which I'll already have passed over, meaning I'd have already been able to calculate sq(n-x) (and squaring a number is cheap...unlike determining the square root of a given number)). The idea is simple...instead of calculating sqrt(n) when needed, build a lookup table as I advance and find more primes...then use the square of smaller primes to estimate the square root of the candidate number I'm dealing with now...(i.e. if sq(n-x) is roughly the same as n (but not smaller), then n-x must be equal to or slightly larger than the square root of n). Make sense?

The speed up is pretty impressive...

written in C, single-thread:

# ./primes_old >old.txt

# ./primes_new >new.txt

# diff -c old.txt new.txt

*** old.txt Tue Dec 28 15:17:33 2010

--- new.txt Tue Dec 28 15:17:48 2010

***************

*** 149993,149996 ****

2015141

2015149

2015161

! The test took a total of: 63.000000 seconds to generate the first 150000 prime numbers.

--- 149993,149996 ----

2015141

2015149

2015161

! The test took a total of: 1.000000 seconds to generate the first 150000 prime numbers.

Edit - I went back and added a "pure" sqrt version to test. The result is that sqrt isn't as costly as I had assumed. There must be some good math voodoo going on in there, because the sqrt version was right on the heels of the enhanced original (slightly slower, but not dramatically slower as I'd assumed it would be).

The speed up is pretty impressive...

written in C, single-thread:

# ./primes_old >old.txt

# ./primes_new >new.txt

# diff -c old.txt new.txt

*** old.txt Tue Dec 28 15:17:33 2010

--- new.txt Tue Dec 28 15:17:48 2010

***************

*** 149993,149996 ****

2015141

2015149

2015161

! The test took a total of: 63.000000 seconds to generate the first 150000 prime numbers.

--- 149993,149996 ----

2015141

2015149

2015161

! The test took a total of: 1.000000 seconds to generate the first 150000 prime numbers.

Edit - I went back and added a "pure" sqrt version to test. The result is that sqrt isn't as costly as I had assumed. There must be some good math voodoo going on in there, because the sqrt version was right on the heels of the enhanced original (slightly slower, but not dramatically slower as I'd assumed it would be).

Total Comments 0