10-31-2011, 02:09 PM   #1
LQ Newbie
Registered: Oct 2011
Posts: 5

Rep: Reputation: Disabled
getting wrong values in matrix multiplication

This is my program:

#include <stdio.h>
#include <cuda_runtime.h>
#define N 200
#define TILE_WIDTH 20

__global__ void MatMul(const int *A, const int *B, int *C) {

    // Tiles of A and B staged in shared memory
    __shared__ int temp1[TILE_WIDTH][TILE_WIDTH];
    __shared__ int temp2[TILE_WIDTH][TILE_WIDTH];

    int idx = threadIdx.x;
    int idy = threadIdx.y;
    int bx = blockIdx.x;
    int by = blockIdx.y;
    int uidx = bx*TILE_WIDTH + idx;   // column of C computed by this thread
    int uidy = by*TILE_WIDTH + idy;   // row of C computed by this thread
    int sum = 0;

    // Walk over the N/TILE_WIDTH tile pairs needed for this block of C
    for (int i = 0; i < N/TILE_WIDTH; i++) {

        // copying the data to shared memory
        temp1[idy][idx] = A[TILE_WIDTH*(by*N+i) + idx+idy*N];
        temp2[idy][idx] = B[TILE_WIDTH*(bx+N*i) + idx+idy*N];

        // synchronizing the threads: both tiles must be fully loaded before use
        __syncthreads();

        // multiplying matrices in shared memory
        for (int k = 0; k < TILE_WIDTH; k++) {
            sum = sum + temp1[idy][k]*temp2[k][idx];
        }

        // synchronizing the threads: nobody overwrites a tile that is still being read
        __syncthreads();
    }

    C[uidy*N + uidx] = sum;
}

int main( void ) {

    int a[N][N], b[N][N], c[N][N]; //host copies of a,b,c

    int *dev_a, *dev_b, *dev_c; //device copies of a,b,c

    // allocate the memory on the GPU
    cudaMalloc( (void**)&dev_a, N * N * sizeof(int) );
    cudaMalloc( (void**)&dev_b, N * N * sizeof(int) );
    cudaMalloc( (void**)&dev_c, N * N * sizeof(int) );

    // fill the matrices 'a' and 'b' on the CPU
    for (int i=0; i<N; i++) {
        for (int j=0; j < N; j++) {
            a[i][j] = j+3;
            b[i][j] = i+6;
        }
    }

    //copy above a,b values to device
    cudaMemcpy( dev_a, a, N * N * sizeof(int), cudaMemcpyHostToDevice );
    cudaMemcpy( dev_b, b, N * N * sizeof(int), cudaMemcpyHostToDevice );

    // Prepare timer
    cudaEvent_t start, stop;
    float elapsed_ms;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    //start record
    cudaEventRecord(start, 0);

    // Kernel invocation: one thread per element of C,
    // in a grid of (N/TILE_WIDTH) x (N/TILE_WIDTH) blocks
    dim3 dimGrid(N/TILE_WIDTH, N/TILE_WIDTH, 1);
    dim3 dimBlock(TILE_WIDTH, TILE_WIDTH, 1);
    MatMul<<<dimGrid, dimBlock>>> (dev_a, dev_b, dev_c);

    //stop record and wait until the kernel has finished
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    //this is operation time
    cudaEventElapsedTime(&elapsed_ms, start, stop);

    //copy result to host
    cudaMemcpy(c, dev_c, N * N * sizeof(int), cudaMemcpyDeviceToHost );

    for (int i=0; i < N; i++){
        for (int j=0; j < N; j++){
            printf( "%d ", c[i][j]);
        }
        printf("\n");
    }

    //clean up: free the timer events and the allocated memory in device
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree( dev_a );
    cudaFree( dev_b );
    cudaFree( dev_c );
    printf("\n multiplication done!!!\n");
    printf(" time elapsed in ms=%f\n", elapsed_ms);
    return 0;
}

I am getting a matrix in which every value is 2829400.
I checked in MATLAB, and there every value should be 2871200.
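
One way to see which of the two numbers to expect is to compute a single entry of the product by hand. With the fill loops in the post, a[i][k] = k+3 and b[k][j] = k+6, so every entry of the result is the same sum over k. The sketch below is plain host-side C and needs no GPU; the function name `expected_entry` and its `first` parameter are hypothetical helpers, not part of the posted program. The `first = 1` case assumes the MATLAB check filled the matrices with 1-based loop indices, which the thread does not show.

```c
#include <assert.h>
#include <stdio.h>

/* Every entry of C = A*B equals the sum over k of (k + 3) * (k + 6),
   because a[i][k] = k + 3 and b[k][j] = k + 6 in the posted fill loops.
   `first` is the first index value: 0 for the C loops in the post,
   1 for an (assumed) MATLAB-style fill with 1-based indices. */
long expected_entry(int n, int first)
{
    assert(n > 0);
    long sum = 0;
    for (int k = first; k < first + n; k++) {
        sum += (long)(k + 3) * (k + 6);
    }
    return sum;
}
```

Calling `expected_entry(200, 0)` gives 2829400, the value the CUDA program prints, and `expected_entry(200, 1)` gives 2871200, the MATLAB figure. If the MATLAB matrices were indeed filled from index 1, the two results differ because the inputs differ, not because the multiplication is wrong.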
10-31-2011, 03:09 PM   #2
Registered: Sep 2008
Location: The Netherlands
Distribution: Slackware64 current
Posts: 592

Rep: Reputation: 140
Is this correct?
temp1[idy][idx]= A[TILE_WIDTH*(by*N+i) + idx+idy*N];
temp2[idy][idx]= B[TILE_WIDTH*(bx+N*i) + idx+idy*N];
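
The question about these two lines can be checked on the CPU by comparing each flattened index with the usual row*N + col form. This is only a sketch: `a_index`, `b_index` and `count_index_mismatches` are hypothetical helper names, and the check assumes the intent is that tile `i` of A lies in block-row `by` and tile `i` of B lies in block-column `bx`, as in a standard tiled multiplication.

```c
#include <stdio.h>

#define N 200
#define TILE_WIDTH 20

/* Index expressions copied from the posted kernel. */
int a_index(int by, int i, int idy, int idx)
{
    return TILE_WIDTH*(by*N + i) + idx + idy*N;
}

int b_index(int bx, int i, int idy, int idx)
{
    return TILE_WIDTH*(bx + N*i) + idx + idy*N;
}

/* Counts how often the posted expressions disagree with row*N + col.
   For A: row = by*TILE_WIDTH + idy, col = i*TILE_WIDTH + idx.
   For B: row = i*TILE_WIDTH + idy,  col = bx*TILE_WIDTH + idx. */
int count_index_mismatches(void)
{
    int tiles = N / TILE_WIDTH;
    int mismatches = 0;
    for (int b = 0; b < tiles; b++) {          /* block index (by or bx) */
        for (int i = 0; i < tiles; i++) {      /* tile step */
            for (int idy = 0; idy < TILE_WIDTH; idy++) {
                for (int idx = 0; idx < TILE_WIDTH; idx++) {
                    int a_ref = (b*TILE_WIDTH + idy)*N + (i*TILE_WIDTH + idx);
                    int b_ref = (i*TILE_WIDTH + idy)*N + (b*TILE_WIDTH + idx);
                    if (a_index(b, i, idy, idx) != a_ref) mismatches++;
                    if (b_index(b, i, idy, idx) != b_ref) mismatches++;
                }
            }
        }
    }
    return mismatches;
}
```

`count_index_mismatches()` returns 0, so under that assumption the two expressions address the intended elements. What matters around them is that both loads, the `__syncthreads()` calls and the inner product all sit inside the loop over `i`.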