[SOLVED] Solving a banded system of equations from C using LAPACK's DGBSV

ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices

Welcome to LinuxQuestions.org, a friendly and active Linux Community.

You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!

Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.

If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.

Having a problem logging in? Please visit this page to clear all LQ-related cookies.

Introduction to Linux - A Hands on Guide

This guide was created as an overview of the Linux Operating System, geared toward new users as an exploration tour and getting started guide, with exercises at the end of each chapter.
For more advanced trainees it can be a desktop reference, and a collection of the base knowledge needed to proceed with system and network administration. This book contains many real life examples derived from the author's experience as a Linux system and network administrator, trainer and consultant. They hope these examples will help you to get a better understanding of the Linux system and that you feel encouraged to try out things on your own.

Click Here to receive this Complete Guide absolutely free.

Solving a banded system of equations from C using LAPACK's DGBSV

I have tried to search for a similar thread related to this topic, but nobody seems to care for banded systems (...)

I am interested in solving a banded matrix using LAPACK/ScaLAPACK from a C code. First, I want to achieve a sequential solution with LAPACK, before attempting anything with ScaLAPACK.

Problem: The row-major/col-major difference between both languages seems to be affecting my solution process. Here's the system I intend to solve:

The following code, translates that matrix into LAPACK's banded data structure, specified in here.

Code:

int rr = 6; // Rank.
int kl = 2; // Number of lower diagonals.
int ku = 1; // Number of upper diagonals.
int nrhs = 1; // Number of RHS.
double vals[36] = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, // Req. ex. space.
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, // Req. ex. space.
666.0, 0.0, 0.0, 0.0, 0.0, 22.5, // First up diag.
1.0, -50.0, -50.0, -50.0, -50.0, -2.6, // Main diagonal.
27.5, 27.5, 27.5, 27.5, 4.0, 666.0, // First low diag.
0.0, 0.0, 0.0, -1.0, 666.0, 666.0}; // 2nd low diag.
int lda = rr; // Leading dimension of the matrix.
int ipiv[6]; // Information on pivoting array.
double rhs[] = {1.0, 1.0, 1.0, 1.0, 1.0, 0.0}; // RHS.
int ldb = lda; // Leading dimension of the RHS.
int info = 0; // Evaluation variable for solution process.
int ii; // Iterator.
int jj; // Iterator.
dgbsv_(&rr, &kl, &ku, &nrhs, vals, &lda, ipiv, rhs, &ldb, &info);
printf("info = %d\n", info);
for (ii = 0; ii < ldb; ii++) {
printf("%f\n", rhs[ii]);
}
putchar('\n');

As I said, I am worried that the way I am translating my matrix, is incorrect given the col-major nature, as well as the indexing nature of Fortran, since my solution yields:

Code:

[ejspeiro@node01 lapack-ex02]$ make runs
`pwd`/blogs < blogs.in
info = 1
1.000000
1.000000
1.000000
1.000000
1.000000
0.000000

info = 1, implies that factorization was completed, but U(i,i) = U(1,1) was 0.

Furthermore, since the solution corresponds to a steady-state diffusive - advective process with constant reactive term, the actual solution looks like this:

Yes, you are right.. Yesterday I looked at the wrong variable when composing the message above (I accidentally posted the product of the matrix and the solution I found, i.e. r.h.s vector). Maybe the only excuse for me is that I'm ill now and sit at home with high temperature. I'm sorry.

I figured out what is wrong with your code. First, your upper diagonal is entered incorrectly ([0,0,0,0,22.5] instead of [0, 22.5, 22.5, 22.5, 22.5]). Second, you must transpose the matrix before passing it to fortran wrapper routine. Here is the code with my modifications:

Code:

#include <stdio.h>
main()
{
int rr = 6; // Rank.
int kl = 2; // Number of lower diagonals.
int ku = 1; // Number of upper diagonals.
int nrhs = 1; // Number of RHS.
/*
double vals[36] = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, // Req. ex. space.
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, // Req. ex. space.
666.0, 0.0, 0.0, 0.0, 0.0, 22.5, // First up diag.
1.0, -50.0, -50.0, -50.0, -50.0, -2.6, // Main diagonal.
27.5, 27.5, 27.5, 27.5, 4.0, 666.0, // First low diag.
0.0, 0.0, 0.0, -1.0, 666.0, 666.0}; // 2nd low diag.
*/
double vals[36] = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, // Req. ex. space.
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, // Req. ex. space.
666.0, 0.0, 22.5, 22.5, 22.5, 22.5, // First up diag.
1.0, -50.0, -50.0, -50.0, -50.0, -2.6, // Main diagonal.
27.5, 27.5, 27.5, 27.5, 4.0, 666.0, // First low diag.
0.0, 0.0, 0.0, -1.0, 666.0, 666.0}; // 2nd low diag.
int lda = rr; // Leading dimension of the matrix.
int ipiv[6]; // Information on pivoting array.
double rhs[] = {1.0, 1.0, 1.0, 1.0, 1.0, 0.0}; // RHS.
int ldb = lda; // Leading dimension of the RHS.
int info = 0; // Evaluation variable for solution process.
int ii; // Iterator.
int jj; // Iterator.
double AB[36];
// transpose `vals' matrix
for(ii=0; ii<lda; ii++)
for(jj=0; jj<lda; jj++)
AB[jj+lda*ii] = vals[ii+lda*jj];
dgbsv_(&rr, &kl, &ku, &nrhs, AB, &lda, ipiv, rhs, &ldb, &info);
printf("info = %d\n", info);
for (ii = 0; ii < ldb; ii++) {
printf("%.9f\n", rhs[ii]);
}
putchar('\n');
}

Excellent! Yes! I had come into that solution of transposing the "stack of arrays". However, I wonder, how would the transposition occur if the resulting CORE matrix were not to be squared. Take for example:

Can you provide complete example which yields an error and the error message? Your code looks good to me and should work equally well on non-square matrix, provided that arrays in and out are of correct dimensions.

Sight u.u' you are right... it would be helpful... let us see... I have created an API for this data type, so I apologize for any incorrect programming practice... although, hopefully, I will improve my code writing by means of your suggestions!

My project code is comprise of 4 files total. I am still working on these files! Feel free to highlight any error, but hopefully, I would have also spotted it

happened to work correctly on a 64 bit system, but it will cause mysterious errors on a 32 bit system, where sizeof(double) != sizeof(double*). The same error repeated when you allocate out->so_memory_holder in Transpose(). You should write sizeof(CMTK_Number) in these places.

3) In my (not very big) experience 2D arrays (arrays of arrays) are not very convenient for representing two dimensional (and more) matrices in C. They are PITA to allocate, deallocate and carry around. I would use linear representation instead. Moreover, LAPACK functions also seem to prefer linear representation.

4) If Transpose() operates on general band matrices, then it may do two a little bit different things:
A) Transpose a band matrix (basically exchange its diagonals)
B) Transpose the core of a band matrix (i.e. transpose a compact representation of a band matrix) to be able pass such matrix to LAPACK routine.

It seems that in its current state your code does not make distinction between these two cases and even mix them a little bit.

5) It is a matter of taste, I know, but why not omit these redundant curly braces?

Dear firstfire: Thanks! This has been an incredibly helpful post!

1) Done! Gratefully, I had already spotted that!
2) You are right! In fact, I have made the adequate changes throughout the code!
3) Again, you are right! I have changed the entire implementation to rely on a 1-d array!

4) ...
4) ...
4) Sight! u.u' This one is still giving me problems I am not able to get a solution for cases in where I get a non-squared CORE matrix /* Notice that the FACADE matrices WILL always be squared, since we are solving for a PDE, but the cores can be non-squared. */ Say 1 PDE, step size of 0.1, as in

I have grown aware of the differences between transposing the banded matrix and actually transposing its core, (thanks for your help with that), and I have developed these two methods (under debugging):

Code:

/*!
Monday, February 18, 2013
This methods transposes a given matrix.
*/
CMTK_Boolean Transpose(CMTK_BandedMatrix *in, CMTK_BandedMatrix *out) {
CMTK_Boolean cool; /*!< Validation variable. */
int ii; /*!< Iterator. */
int jj; /*!< Iterator. */
int acum_01; /*!< Accumulator. */
/*
This conditional is written for safety reasons, in case the user forgets to
initialize the matrix to be constructed before calling THIS constructor. If
she initializes the pointer, then this call will NOT do it, thus skyping
over the conditional. If she does NOT initialize the pointer, this constructor
will do it, thus ensuring the code will not break, BUT it will yield the
following error from Valgrind:
==22540== HEAP SUMMARY:
==22540== in use at exit: 0 bytes in 0 blocks
==22540== total heap usage: 3 allocs, 3 frees, 272 bytes allocated
==22540==
==22540== All heap blocks were freed -- no leaks are possible
==22540==
==22540== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
==22540==
==22540== 1 errors in context 1 of 1:
==22540== Conditional jump or move depends on uninitialised value(s)
==22540== at 0x400778: ConstructFromBandedMatrixInfo
==22540== (in /home/ejspeiro/lapack-ex02/blogs)
==22540== by 0x400E94: main (in /home/ejspeiro/lapack-ex02/blogs)
==22540== Uninitialised value was created by a stack allocation
==22540== at 0x400E52: main (in /home/ejspeiro/lapack-ex02/blogs)
*/
if (out == NULL) {
out = (CMTK_BandedMatrix *) malloc(sizeof(CMTK_BandedMatrix));
if (out == NULL) {
return CMTK_FALSE;
}
}
//! Properties of the matrix's CORE:
out->co_expanded = in->co_expanded;
out->co_num_rows = in->co_num_cols;
out->co_num_cols = in->co_num_rows;
out->co_kl = in->co_ku;
out->co_ku = in->co_kl;
//! Build the matrix's CORE:
// TODO: (0x02).
out->co_values = (CMTK_Number *)
calloc(out->co_num_rows*out->co_num_cols, sizeof(CMTK_Number));
if (out->co_values == NULL) {
free(out);
return CMTK_FALSE;
}
out->co_leading_dim = out->co_ku + 2*out->co_kl + 1;
//! Properties of the matrix's FACADE:
out->fa_bcs = in->fa_bcs;
out->fa_num_rows = in->fa_num_cols;
out->fa_num_cols = in->fa_num_rows;
out->fa_rank = in->fa_rank;
out->fa_num_values = in->fa_num_values;
out->fa_num_non_zero_values = in->fa_num_non_zero_values;
// Compute fa_num_significant_values:
acum_01 = 0;
for (ii = 1; ii <= out->co_kl; ii++) {
acum_01 = acum_01 + (out->fa_rank - ii);
}
acum_01 = 2*acum_01 + out->fa_rank;
out->fa_num_significant_values = acum_01;
out->fa_bandwidth = in->fa_bandwidth;
// Computing the densities and the sparsities:
// The ABSOLUTE DENSITY is computed based on the absolute number of non-zero
// values, which we are no interested in computing in advanced. We will update
// those as we add elements, therefore, we will also update this property, as
// we add the values:
out->fa_abs_density = 0.0;
out->fa_rel_density =
((CMTK_Number) out->fa_num_significant_values)/out->fa_num_values;
out->fa_abs_sparsity = 1.0 - out->fa_abs_density;
out->fa_rel_sparsity = 1.0 - out->fa_rel_density;
// Perform the actual transposition:
for (ii = 0; ii < in->co_num_rows; ii++) {
for (jj = 0; jj < in->co_num_cols; jj++) {
cool = Set(out, jj, ii, GetOrDie(in, ii, jj));
if (!cool) {
fprintf(stderr, "COULD NOT add value at %d %d! Out of band?\n", ii, jj);
return CMTK_FALSE;
}
}
}
return CMTK_TRUE;
}

And:

Code:

/*!
Tuesday, February 19, 2013
This methods gets the core array from a matrix.
*/
CMTK_Number* GetCoreValues(CMTK_BandedMatrix *in, char trans) {
CMTK_Number *out;
int ii;
int jj;
out = (CMTK_Number*)
calloc(in->co_num_rows*in->co_num_cols, sizeof(CMTK_Number));
if (out == NULL) {
return NULL;
}
switch (trans) {
case 'T': {
for (ii = 0; ii < in->co_num_rows; ii++) {
for (jj = 0; jj < in->co_num_cols; jj++) {
out[ii + jj*in->co_num_rows] = in->co_values[ii*in->co_num_cols + jj];
}
}
break;
}
case 'N': {
for (ii = 0; ii < in->co_num_rows; ii++) {
for (jj = 0; jj < in->co_num_cols; jj++) {
out[ii*in->co_num_cols + jj] = in->co_values[ii*in->co_num_cols + jj];
}
}
break;
}
default: {
return NULL;
break;
}
}
return out;
}

I also don't really get the grasp of the concept of a leading dimensions :$ Since I think I might be getting some sort of error when transposing.

Thanks in advanced!

Last edited by ejspeiro; 02-19-2013 at 10:56 PM.
Reason: Updating 1st posted source code

I believe the issue arises when accessing the values in the soon-to-be transposed matrices, since I can only do it within the scope of the band (because the core is entirely pre-allocated), and for some reason, there are cases in where I add in locations outside of the scope of the band.

May there be something funky with the way I define the limits of my cores?

LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.