LinuxQuestions.org
Old 01-23-2012, 11:50 AM   #1
PB0711
Member
 
Registered: Aug 2004
Location: London, UK
Distribution: Ubuntu 10.10, ubuntu 11.04, suse 9.2, OSX
Posts: 259

Rep: Reputation: 30
Correlation in C with matrix


Hello all,

I'm a higher-level programmer (R, Perl, MATLAB, etc.). I want to get started with some C coding, but I'm finding it pretty difficult. I would like to interface my R program with some C code, but first I need the C code.

I have a 6783x102 matrix (rows x columns) and I want to compute the correlation between two rows whenever some criterion is met. In pseudocode:

Code:
for (int i = 0; i < numRows; i++) {
  for (int j = 0; j < numRows; j++) {
    if (VectorCriteria[i] > kon + VectorCriteria[j]) {
      /* do correlation of mat[i][ ] vs mat[j][ ] */
    }
  }
}
So my question is: in C, what is the best way to represent a matrix, and what is the best way to do this? For the correlation itself I think I can probably find a library.

Thanks for the help,

Paul
 
Old 01-23-2012, 12:03 PM   #2
johnsfine
Senior Member
 
Registered: Dec 2007
Distribution: Mepis, Centos
Posts: 4,012

Rep: Reputation: 731
Quote:
Originally Posted by PB0711 View Post
with C how is the best way to represent a matrix
If the matrix size is known at compile time, it is best (for both performance and simplicity) to represent the matrix as an array of arrays.

Code:
void do_correlation(double* x, double*y);
...
enum {rows=6783, columns=102};
double mat[rows][columns];
...
for (int i=0; i<rows; i++)
   for (int j=0; j<rows; j++)
      if (...)
         do_correlation( mat[i], mat[j] );
Notes:
I used enum to define integer compile-time constants. If you don't like that, use some other method, but enum has advantages.
In C, arrays are passed as pointers, which can be confusing. I predeclared do_correlation as taking its inputs by double* to make the true parameter-passing method more explicit.

If the number of rows is not known at compile time, that makes the above approach a bit trickier. If the number of columns is not known at compile time, that is a much bigger problem for the above approach. You probably want a more complicated way to represent the matrix if the number of columns is not known at compile time.
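
One common way to handle sizes that are only known at run time is a single contiguous block indexed by hand. This is only a sketch under that assumption; the matrix type and the matrix_alloc, matrix_row and matrix_free names are hypothetical and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical matrix type: one contiguous row-major block. */
typedef struct {
    size_t rows, cols;
    double *data;
} matrix;

/* Allocates storage for rows x cols doubles; data is NULL on failure. */
matrix matrix_alloc(size_t rows, size_t cols)
{
    matrix m = { rows, cols, NULL };
    assert(rows > 0 && cols > 0);
    m.data = calloc(rows * cols, sizeof *m.data);
    return m;
}

/* Pointer to the start of row i, usable wherever a double* row is expected. */
double *matrix_row(const matrix *m, size_t i)
{
    assert(i < m->rows);
    return m->data + i * m->cols;
}

/* Releases the storage and clears the pointer. */
void matrix_free(matrix *m)
{
    free(m->data);
    m->data = NULL;
}
```

Because each row is still a plain run of doubles, matrix_row(&m, i) can be passed to a function that takes double*, just like mat[i] in the fixed-size version.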

BTW, why C rather than C++? C++ makes things like this easier and is not necessarily any less efficient than C.

Last edited by johnsfine; 01-23-2012 at 12:12 PM.
 
Old 01-24-2012, 09:01 PM   #3
ta0kira
Senior Member
 
Registered: Sep 2004
Distribution: Slackware64 13.37, Kubuntu 10.04
Posts: 2,944

Rep: Reputation: Disabled
Quote:
Originally Posted by PB0711 View Post
Code:
for (int i = 0; i < numRows; i++) {
  for (int j = 0; j < numRows; j++) {
    if (VectorCriteria[i] > kon + VectorCriteria[j]) {
      /* do correlation of mat[i][ ] vs mat[j][ ] */
    }
  }
}
This will be very inefficient. Using R code, the correlation matrix, where element (m,n) is the correlation coefficient between row m and row n of the original matrix, can be computed as follows:
Code:
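# vals: the matrix you described (one vector per row)
vals.centered <- vals - rowMeans(vals) # subtract each row's mean; the product below is a covariance only for zero-mean rows
vals.covar <- (function(x) x %*% t(x) / ncol(x))(vals.centered) # the covariance
vals.corr  <- (function(x) diag( 1/sqrt(diag(x)) ) %*% x %*% diag( 1/sqrt(diag(x)) ))(vals.covar) # the correlation matrix
# the built-in cor(t(vals)) gives the same row-by-row correlation matrix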
You can break this down into matrix multiplication (the %*% operator in R) in C, and since vals.covar and vals.corr are symmetric by definition, you only need to compute half of the elements. The computation of vals.corr is just a normalization of each row and column by the square root of the corresponding diagonal element, which leaves the diagonal elements equal to 1.
Quote:
Originally Posted by PB0711 View Post
I would like to interface my R program with some C code, but first I need the C code.
If you're going to write an R extension in C you'll need to write a wrapper in R, anyway. You might as well design it so you only compute in C what can't be adequately computed in R, e.g. compute the correlation in R and pass it to a C function that expects it to already be computed.
Kevin Barry

Last edited by ta0kira; 01-24-2012 at 09:03 PM.