LinuxQuestions.org
Share your knowledge at the LQ Wiki.
Home Forums Tutorials Articles Register
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices


Reply
  Search this Thread
Old 11-02-2014, 11:22 AM   #1
moisespedro
Senior Member
 
Registered: Nov 2013
Location: Brazil
Distribution: Slackware
Posts: 1,223

Rep: Reputation: 195Reputation: 195
Multiple cores/optimization in C


Hi, I've wrote this as an university exercise. It is a program to calculate and say if a number is perfect or not.

Code:
#include <stdio.h>

int numero_perfeito(unsigned long int x) {
    
    unsigned long int i, soma = 0;
    for(i = 1; i < x ; i++) {
        if (x % i == 0)
        soma = soma + i;
    }   

    if(soma == x)
        return 1;
    return 0;
}

main() {
    unsigned long int a;
    int retorno;
    
    printf("Digite um numero:\n");
    scanf("%lu", &a);

    retorno = numero_perfeito(a);
    if(retorno) 
        printf("Numero perfeito!\n");
    else
        printf("Meh, nao e' perfeito\n");
}
It takes a number typed in by the user, calls "numero_perfeito" function and returns 1 (perfect number) or 0 (not perfect number). While running it with large numbers, such as 2305843008139952128 it uses 100% of one CPU core and it takes quite some time to calculate (it is running for more than 9 minutes at the time of this post). How can I make this program faster? Can I set it to use all CPU cores?
Attached Thumbnails
Click image for larger version

Name:	Screenshot at 2014-11-02 15:21:36.png
Views:	14
Size:	21.8 KB
ID:	16805   Click image for larger version

Name:	Screenshot at 2014-11-02 15:21:41.png
Views:	14
Size:	10.5 KB
ID:	16806  

Last edited by moisespedro; 11-02-2014 at 11:24 AM. Reason: Attachments
 
Old 11-02-2014, 11:45 AM   #2
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,863
Blog Entries: 1

Rep: Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869
Use sqrt function to determine the square-root of the number:

Code:
#include <math.h>

int numero_perfeito(unsigned long int x) {
    
    unsigned long int i, soma = 0;
    unsigned long maxdiv= sqrt (x);

    for(i= 1; i <= maxdiv; i++) {
        if (x % i == 0) {
            soma = soma + i;
            if (i != 1 && i != x/i) {
                soma = soma + x/i;
            }
        }
    }   

    if(soma == x)
        return 1;
    return 0;
}
 
1 members found this post helpful.
Old 11-02-2014, 11:54 AM   #3
moisespedro
Senior Member
 
Registered: Nov 2013
Location: Brazil
Distribution: Slackware
Posts: 1,223

Original Poster
Rep: Reputation: 195Reputation: 195
I don't get it. I tested here and it worked but I am not sure why.

Like, take 28 for example, it is a perfect number. maxdiv is sqrt of 28 which is equal to 5.29150262213 but 28 is also divisible by 7.
 
Old 11-02-2014, 12:19 PM   #4
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,863
Blog Entries: 1

Rep: Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869
when i=4, you program will execute this:
Code:
sum := sum + 4 + 28/4
 
Old 11-02-2014, 04:55 PM   #5
moisespedro
Senior Member
 
Registered: Nov 2013
Location: Brazil
Distribution: Slackware
Posts: 1,223

Original Poster
Rep: Reputation: 195Reputation: 195
I am still not sure I got it. Anyways, it still takes some time.
 
Old 11-02-2014, 06:58 PM   #6
metaschima
Senior Member
 
Registered: Dec 2013
Distribution: Slackware
Posts: 1,982

Rep: Reputation: 492Reputation: 492Reputation: 492Reputation: 492Reputation: 492
Here's my multi-threaded version, there's a bit of a problem with number 6, because how can you even split that between say 4 threads ? If you can figure out an algorithm, good for you.

Code:
bash-4.2$ compile pn.c -fopenmp

# 1 thread    
bash-4.2$ time ./pn 8589869056
8589869056 is a perfect number

real	1m3.830s
user	1m3.849s
sys	0m0.000s

# 4 threads
bash-4.2$ time ./pn 8589869056
8589869056 is a perfect number

real	0m16.885s
user	1m7.395s
sys	0m0.004s
That's 3.78 times faster.

Code:
#include <stdio.h>
#include <stdint.h>

#include <omp.h>

#define NUM_THREADS 4

int main (int argc, char * argv[])
{
	uint64_t i;
	uint64_t num;
	uint64_t tnum[2][NUM_THREADS];
	unsigned int tid;

	uint64_t sum[NUM_THREADS] = {};
	uint64_t total = 0;

	// parse arg
	if (2 != argc)
	{
		fprintf (stderr, "Usage: %s number\n", argv[0]);
		return 1;
	}

	// get num
	sscanf (argv[1], "%lu", &num);

	if (6 == num)
	{
		printf ("6 is a perfect number\n");
		return 0;
	}

	for (i = 0; i < NUM_THREADS; i++)
	{
		tnum[1][i] = (i + 1) * num / NUM_THREADS;
		tnum[0][i] = tnum[1][i] - (num / NUM_THREADS);
		//printf ("%lu %lu\n", tnum[0][i], tnum[1][i]);
	}
	tnum[0][0] = 1;
	tnum[1][NUM_THREADS - 1] = num;
	//printf ("%lu\n", tnum[1][NUM_THREADS - 1]);

	// main loop
	#pragma omp parallel private(i, tid) shared(num, sum, tnum) num_threads(NUM_THREADS)
	{
		tid = omp_get_thread_num();
		//printf ("%d %lu %lu %lu\n", tid, tnum[0][tid], tnum[1][tid], num);
		for (i = tnum[0][tid]; i < tnum[1][tid]; i++)
		{
			//printf ("%d %lu\n", tid, i);
			if (0 == num % i)
			{
				sum[tid] += i;
				//printf ("adding %lu\n", i);
			}
		}
	}

	// calculate total
	for (i = 0; i < NUM_THREADS; i++)
	{
		total += sum[i];
		//printf ("%lu\n", sum[i]);
	}
	//printf ("%lu\n", total);

	// output result
	if (total == num)
	{
		printf ("%lu is a perfect number\n", num);
	}
	else
	{
		printf ("%lu is NOT a perfect number\n", num);
	}

	return 0;
}
Not fully bug tested, but don't have time for that.
 
1 members found this post helpful.
Old 11-02-2014, 07:52 PM   #7
moisespedro
Senior Member
 
Registered: Nov 2013
Location: Brazil
Distribution: Slackware
Posts: 1,223

Original Poster
Rep: Reputation: 195Reputation: 195
Gonna give your version a shot. I am a still a C begginer and I am trying to understand the code.

Did you try a really large number like 2305843008139952128?

I left it running for hours and it didn't finish (I ended up turning off the PC without noticing it ).

My last attempt:

Code:
$ time ./numeroperfeito-1                    
Digite um numero:
2305843008139952128
^C
real	175m1.960s
user	174m44.540s
sys	0m2.831s
EDIT: And how rude from my part, many thanks for your code (you both)

Last edited by moisespedro; 11-02-2014 at 07:54 PM.
 
Old 11-02-2014, 08:27 PM   #8
metaschima
Senior Member
 
Registered: Dec 2013
Distribution: Slackware
Posts: 1,982

Rep: Reputation: 492Reputation: 492Reputation: 492Reputation: 492Reputation: 492
Remember that finding primes and perfect numbers is computationally very expensive. Don't expect the calculation of higher values to be doable on a desktop in a day. If you wanted even more optimization, you could try using some MMX or SSE, but I've found that it is very difficult, requires knowledge of assembly, is less portable, and only with an expert's touch can it truly outperform a compiler except in obvious cases (and the compiler is likely to figure these out anyway). If I have more time, I'll see how plausible it is to use MMX or SSE for this. The main problem is that loading and unloading from these registers is slower than for other registers, so you should keep data in the registers if you want a performance improvement.
 
Old 11-03-2014, 01:14 AM   #9
a4z
Senior Member
 
Registered: Feb 2009
Posts: 1,727

Rep: Reputation: 742Reputation: 742Reputation: 742Reputation: 742Reputation: 742Reputation: 742Reputation: 742
the openmp sample is already nice and goes into the right direction,
additional
try help the compiler by passing
-funroll-loops -ftree-vectorize -march=corei7-avx (orwhat ever cool cpu you have)
and check if your loops are written in a way that the compiler can unroll and verctorize them, do not write sse code by hand, help the copilerr to generate it
additionally, no optimization runs out the best, or a better , algorithm,
so you can also check it the way you solve the problem is the best to do
 
Old 11-03-2014, 11:14 AM   #10
nbritton
Member
 
Registered: Jun 2013
Location: Dubuque, IA
Distribution: Red Hat Enterprise Linux, Mac OS X, Ubuntu, Fedora, FreeBSD
Posts: 89

Rep: Reputation: Disabled
If I recall correctly, all, even, positive perfect numbers can be factored with the equation y = 2^(2x - 1) - 2^(x - 1) when x is a mersenne prime number.

Quote:
x = mersenne prime
x = x, y = 2^(2x - 1) - 2^(x - 1)
x = 2, y = 2^3 - 2^1 = 6
x = 3, y = 2^5 - 2^2 = 28
x = 5, y = 2^9 - 2^4 = 496
x = 7, y = 2^13 - 2^6 = 8128
x = 13, y = 2^25 - 2^12 = 33,550,336
x = 17, y = 2^33 - 2^16 = 8,589,869,056
x = 19, y = 2^37 - 2^18 = 137,438,691,328
x = 31, y = 2^61 - 2^30 = 2,305,843,008,139,952,128

Last edited by nbritton; 11-03-2014 at 12:03 PM.
 
Old 11-03-2014, 11:33 AM   #11
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197
Quote:
Originally Posted by nbritton View Post
If I recall correctly, all positive perfect numbers can be factored with the equation y = 2^(2x - 1) - 2^(x - 1) when x is a prime number.
All even perfect numbers can be factored with the equation
y = 2^(x-1) * (2^x-1)
where x is prime and (2^x-1) is also prime.

Your equation is equivalent.

y is perfect if and only if (2^x-1) is prime.

So the search for even perfect numbers is equivalent to searching prime numbers x for those for which (2^x-1) is prime. That is much much faster than directly testing whether y is perfect.

There is no proof (so far as I know) that odd perfect numbers don't exist. But it is pretty easy to prove no odd perfect number is small enough to be found with a simplistic search like the programs discussed earlier in this thread (certainly no odd perfect number is smaller than 2^64, so unsigned long could not hold one). So for practical purposes, perfect numbers are even.
 
Old 11-03-2014, 12:23 PM   #12
metaschima
Senior Member
 
Registered: Dec 2013
Distribution: Slackware
Posts: 1,982

Rep: Reputation: 492Reputation: 492Reputation: 492Reputation: 492Reputation: 492
Quote:
Originally Posted by johnsfine View Post
But it is pretty easy to prove no odd perfect number is small enough to be found with a simplistic search like the programs discussed earlier in this thread (certainly no odd perfect number is smaller than 2^64, so unsigned long could not hold one). So for practical purposes, perfect numbers are even.
That's one thing I was wondering about, it is likely that even 64-bit numbers will overflow. Not sure how to fix this, you many need some library that can handle larger numbers.

If you really wanted to find more perfect numbers, why not try to construct pernicious numbers and test them ? Like the wiki says:
6 = 110
28 = 11100
496 = 111110000
8128 = 1111111000000
33550336 = 1111111111111000000000000
 
Old 11-03-2014, 01:25 PM   #13
nbritton
Member
 
Registered: Jun 2013
Location: Dubuque, IA
Distribution: Red Hat Enterprise Linux, Mac OS X, Ubuntu, Fedora, FreeBSD
Posts: 89

Rep: Reputation: Disabled
http://primes.utm.edu/mersenne/

Another optimization you can make is that, even, perfect numbers end in ether 6 or 8, the pattern of the series can be described with the equation y = 8 - mod(x + x^2, 4).

For example:
Code:
$ for x in $(cat mersenne_powers); do ((echo "$x, `bc <<< "2^($x-1) * (2^$x-1)" | tail -c 2`")&); done
2, 6
3, 8
5, 6
7, 8
13, 6
17, 6
19, 8
31, 8
61, 6
89, 6
107, 8
127, 8
521, 6
607, 8
1279, 8
2203, 8
2281, 6
3217, 6
4253, 6
4423, 8
9689, 6
9941, 6
11213, 6
19937, 6
21701, 6
23209, 6
44497, 6
86243, 8
110503, 8
132049, 6
216091, 8
756839, 8
859433, 6
1257787, 8
1398269, 6
2976221, 6
3021377, 6
6972593, 6
13466917, 6
20996011, 8
24036583, 8
25964951, 8
30402457, 6
32582657, 6
37156667, 8
42643801, 6
43112609, 6
57885161, 6

Last edited by nbritton; 11-03-2014 at 10:56 PM.
 
Old 11-03-2014, 08:34 PM   #14
nbritton
Member
 
Registered: Jun 2013
Location: Dubuque, IA
Distribution: Red Hat Enterprise Linux, Mac OS X, Ubuntu, Fedora, FreeBSD
Posts: 89

Rep: Reputation: Disabled
Quote:
Originally Posted by johnsfine View Post
All even perfect numbers can be factored with the equation
y = 2^(x-1) * (2^x-1)
where x is prime and (2^x-1) is also prime.

Your equation is equivalent.
According to bc, your equation is faster to compute on x86_64...
Code:
$ time echo "2^(10000000-1) * (2^10000000-1)" | bc > /dev/null

real	10m51.011s
user	10m50.292s
sys	0m0.589s


$ time echo "2^(2*10000000-1) - 2^(10000000-1)" | bc > /dev/null

real	11m37.377s
user	11m36.306s
sys	0m0.861s
 
Old 11-04-2014, 07:44 AM   #15
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197
Quote:
Originally Posted by nbritton View Post
According to bc, your equation is faster to compute on x86_64...
Computing that isn't a significant fraction of the time regardless of which way you do it.

The significant time is in testing whether (2^X-1) is prime.

Testing whether 2^(X-1)*(2^X-1) is perfect is mathematically equivalent to testing whether (2^X-1) is prime, but since 2^(X-1)*(2^X-1)is a far larger number, directly testing whether it is perfect would be far slower than directly testing whether (2^X-1) is prime.

After discovering that (2^X-1) is prime, computing 2^(X-1)*(2^X-1) is trivial, then converting it to decimal for display is bit harder, but still trivial compared to discovering (2^X-1) is prime.

Last edited by johnsfine; 11-05-2014 at 12:22 PM. Reason: Some of the ^ marks had somehow gotten dropped
 
  


Reply



Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
using multiple cores when installing via sbopkg Cultist Slackware 1 09-23-2011 10:47 AM
How does software know how to use Multiple CPU and Cores Ronayn Linux - Software 4 12-20-2010 05:36 AM
How does the kernel handle multiple cpus/cores? riddler313 Linux - General 4 06-18-2009 08:35 AM
configuring MYSQL 5.0.67 to use multiple cores (processors) nino_of_qubic Linux - Server 3 03-31-2009 11:29 PM
linux on multiple cores reslowgr Linux - Server 1 01-20-2007 03:35 PM

LinuxQuestions.org > Forums > Non-*NIX Forums > Programming

All times are GMT -5. The time now is 03:39 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration