[SOLVED] MapReduce

vardhan22 · 04-19-2012, 01:51 AM

Hey all,
I want to implement simple map-reduce functionality in C. I am not using multiple CPUs; I just want to understand how to do the mapping and how to reduce the result.

So far, my idea is to divide the work among multiple threads, collect their results, and reduce them. But I do not understand the basics of how threads work. I am creating threads with pthread_create, but since I do not understand how a thread actually runs, I am not sure what to do next.

As a simple example, suppose I have an array of 10000 elements and I want its sum. I can use around 10 threads, which work independently and each return the sum of a sub-array.
But how do I check which thread is working, and how many elements each thread has processed?

Can anyone help me with that?

pan64 · 04-19-2012, 02:44 AM

Here is a tutorial: https://computing.llnl.gov/tutorials/pthreads/
Here is another one: http://www.yolinux.com/TUTORIALS/Lin...ixThreads.html
Do not try to write a multithreaded program if you are not familiar with threads; it is unlikely to work.

vardhan22 · 04-19-2012, 05:34 AM

Sir,
I am not new to thread programming; I am just not sure exactly how the thread routines get called.
I can correctly find the sum of 1000 elements with 10 threads, each providing the sum of 100 elements. What I want is to create some threads, say 10, and have them divide the work among themselves and give me the sum of these 1000 elements, without my having to tell each thread, through the argument passed to pthread_create, that it should sum 100 elements.
So, how do I do that?
Please help me.
Thank you

pan64 · 04-19-2012, 06:18 AM

I think you need to divide the work (you could say there should be a master thread), and all the worker threads will do their jobs. For example, a worker thread can sum up 100 elements of a list and store the result. The master thread will start 10 identical threads; each of them just gets a different list, or a different starting element in the list. You do not need to watch the threads; you only need to wait for them to complete.
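
As a rough illustration of the "divide the work" step, here is a minimal sketch of how a master thread might compute which part of the array each worker gets. The names struct range and chunk_for are hypothetical (they are not from the thread or from pthreads), and the choice to give the leftover elements to the first few workers is just one possible convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: half-open range [begin, end) handed to one worker. */
struct range {
    size_t begin;
    size_t end;
};

/*
 * Split n elements over n_workers as evenly as possible.
 * When n is not a multiple of n_workers, the first (n % n_workers)
 * workers each receive one extra element.
 */
static struct range chunk_for(size_t n, size_t n_workers, size_t worker)
{
    assert(n_workers > 0 && worker < n_workers);

    size_t base  = n / n_workers;   /* elements every worker gets */
    size_t extra = n % n_workers;   /* leftovers to spread around  */
    struct range r;

    r.begin = worker * base + (worker < extra ? worker : extra);
    r.end   = r.begin + base + (worker < extra ? 1 : 0);
    return r;
}
```

With 1000 elements and 10 workers this gives worker 3 the range [300, 400), which matches the "100 elements per thread" split discussed above; with 10 elements and 3 workers the ranges would be [0, 4), [4, 7) and [7, 10).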

vardhan22 · 04-19-2012, 08:50 AM

Yes Sir, this is exactly what I want to do. Can you suggest how to create this master-worker model of threads? Please give me a tutorial on it or some sample C code.
Also, how do I divide the work among the threads?

Thank you, Sir.

pan64 · 04-20-2012, 12:55 AM

The "master" thread is usually the first thread, the one the program starts with (it runs the main() function).
There you initialize your list or array of 1000 elements.
Next you need a function that sums up 100 elements of the array. This can be a simple loop, something like this:
for (int i = start; i < start + 100; i++) sum += array[i];
The inputs are the start index and the variable in which to store the result.
Now you can call pthread_create to start this function 10 times, using ten different start values and ten different variables to store the results.
You can use pthread_join to wait for the threads to finish, and finally you sum the results and print the total.

See the tutorial links; they have really good sample code to start with.
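
Put together, the steps above might look roughly like the sketch below. It is only an illustration under assumptions: the names struct sum_task, sum_worker, parallel_sum and N_THREADS are made up for this example, and it assumes the array length is a multiple of the thread count. Each worker writes only to its own result slot, so no locking is needed; the master reads the slots after pthread_join.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define N_THREADS 10   /* assumed worker count, as in the discussion */

/* Hypothetical per-thread argument: the input slice plus a result slot. */
struct sum_task {
    const int *data;
    size_t     start;
    size_t     count;
    long       sum;    /* written by the worker, read after pthread_join */
};

/* "Map" step: each worker sums its own slice of the array. */
static void *sum_worker(void *arg)
{
    struct sum_task *t = arg;
    long s = 0;

    for (size_t i = t->start; i < t->start + t->count; i++)
        s += t->data[i];
    t->sum = s;
    return NULL;
}

/*
 * Master: starts the workers, waits for them, then "reduces" the partial
 * sums into *total. Returns 0 on success, -1 if a thread could not be
 * created (in which case *total only covers the threads that did start).
 */
static int parallel_sum(const int *data, size_t n, long *total)
{
    assert(n % N_THREADS == 0);   /* simplifying assumption of this sketch */

    pthread_t       tid[N_THREADS];
    struct sum_task task[N_THREADS];
    size_t          chunk   = n / N_THREADS;
    int             started = 0;
    int             rc      = 0;

    for (int i = 0; i < N_THREADS; i++) {
        task[i] = (struct sum_task){ data, (size_t)i * chunk, chunk, 0 };
        if (pthread_create(&tid[i], NULL, sum_worker, &task[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    *total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);   /* wait for worker i to finish */
        *total += task[i].sum;        /* reduce step */
    }
    return rc;
}
```

A caller would fill an int array of 1000 elements in main(), call parallel_sum(array, 1000, &total), and print total. On Linux this typically needs to be compiled with the -pthread option.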

vardhan22 · 04-20-2012, 03:48 AM

Yes, I got it. Thank you. I successfully implemented that, with synchronization among the threads.
Thank you for your help.
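
The poster's final code is not shown in the thread, so the following is only a guess at what a synchronized variant could look like, not the actual solution. In this sketch the workers divide the work among themselves by claiming batches from a shared index protected by a mutex, and each worker records how many elements it processed. All names here (struct shared, struct worker, claim_and_sum, dynamic_sum, N_WORKERS, BATCH) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define N_WORKERS 10
#define BATCH     64   /* elements claimed per lock acquisition (arbitrary) */

/* Hypothetical state shared by all workers. */
struct shared {
    const int      *data;
    size_t          n;
    size_t          next;    /* first unclaimed index, guarded by lock */
    long            total;   /* running sum, guarded by lock */
    pthread_mutex_t lock;
};

struct worker {
    struct shared *sh;
    size_t         processed;   /* elements summed by this worker */
};

static void *claim_and_sum(void *arg)
{
    struct worker *w  = arg;
    struct shared *sh = w->sh;
    long local = 0;

    for (;;) {
        /* Claim the next batch of indices under the lock. */
        pthread_mutex_lock(&sh->lock);
        size_t begin = sh->next;
        size_t end   = (sh->n - begin > BATCH) ? begin + BATCH : sh->n;
        sh->next = end;
        pthread_mutex_unlock(&sh->lock);

        if (begin == end)
            break;                      /* nothing left to claim */
        for (size_t i = begin; i < end; i++)
            local += sh->data[i];
        w->processed += end - begin;
    }

    /* Reduce: fold this worker's partial sum into the shared total. */
    pthread_mutex_lock(&sh->lock);
    sh->total += local;
    pthread_mutex_unlock(&sh->lock);
    return NULL;
}

/* Returns 0 if at least one worker ran (so all elements were summed), else -1. */
static int dynamic_sum(const int *data, size_t n, long *total,
                       size_t processed[N_WORKERS])
{
    struct shared sh = { .data = data, .n = n, .next = 0, .total = 0 };
    pthread_t     tid[N_WORKERS];
    struct worker w[N_WORKERS];
    int           started = 0;

    assert(data != NULL || n == 0);
    pthread_mutex_init(&sh.lock, NULL);
    for (int i = 0; i < N_WORKERS; i++) {
        w[i].sh = &sh;
        w[i].processed = 0;
    }
    for (int i = 0; i < N_WORKERS; i++) {
        if (pthread_create(&tid[i], NULL, claim_and_sum, &w[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    pthread_mutex_destroy(&sh.lock);

    for (int i = 0; i < N_WORKERS; i++)
        processed[i] = w[i].processed;
    *total = sh.total;
    return started > 0 ? 0 : -1;
}
```

After dynamic_sum returns, the processed array tells you how many elements each thread handled, which also answers the earlier question about tracking per-thread work. The per-thread counts will generally differ from run to run, since they depend on how the threads are scheduled.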