简体   繁体   中英

Format banded matrix for LAPACKE

I'm trying to solve a general banded matrix using the C interface to LAPACK called LAPACKE in Intel's MKL. The function I'm trying to call is *gbsv , where the * denotes the format. Unfortunately, I'm finding it VERY difficult to find working examples on how to format the banded matrix using the C interface. If someone could provide a working example for all the C users out there, I assure you it would be helpful.

The fortran layout is given as an example here , but I'm not exactly sure how I would format this in for input to LAPACKE. I should also note, that in my problem, I have to build the banded matrix on the fly. So I have 5 coefficients, A,B,C,D,E for each i-node, which have to be put into a banded matrix form, then passed to LAPACKE.

The prototype of the function LAPACKE_dgbsv() is the following:

lapack_int LAPACKE_dgbsv( int matrix_layout, lapack_int n, lapack_int kl,
                      lapack_int ku, lapack_int nrhs, double* ab,
                      lapack_int ldab, lapack_int* ipiv, double* b,
                      lapack_int ldb )

The main difference with the function dgbsv() of Lapack is the argument matrix_layout , which can be LAPACK_ROW_MAJOR (C ordering) or LAPACK_COL_MAJOR (Fortran ordering). If LAPACK_ROW_MAJOR , LAPACKE_dgbsv will transpose the matrices, call dgbsv() and then transpose the matrices back to C ordering.

The meaning of the other arguments is the same as for the function dgbsv() . If LAPACK_ROW_MAJOR is used, then the correct ldab for dgbsv() will be computed by LAPACKE_dgbsv() and the argument ldab can be set to n . However, just like dgbsv() , additionnal space must be allocated for the matrix ab to store the details of the factorization.

The following example makes use of LAPACKE_dgbsv() to solve 1D stationnary diffusion by centered finite difference. Null temperature boundary condition are considered and one of a sine wave is used as a source term to check the correctness. The following program is compiled by gcc main3.c -o main3 -llapacke -llapack -lblas -Wall :

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <time.h>

#include <lapacke.h>

int main(void){

    srand (time(NULL));

    //size of the matrix
    int n=10;
    // number of right-hand size
    int nrhs=4;

    int ku=2;
    int kl=2;
    // ldab is larger than the number of bands, 
    // to store the details of factorization
    int ldab = 2*kl+ku+1;

    //memory initialization
    double *a=malloc(n*ldab*sizeof(double));
    if(a==NULL){fprintf(stderr,"malloc failed\n");exit(1);}

    double *b=malloc(n*nrhs*sizeof(double));
    if(b==NULL){fprintf(stderr,"malloc failed\n");exit(1);}

    int *ipiv=malloc(n*sizeof(int));
    if(ipiv==NULL){fprintf(stderr,"malloc failed\n");exit(1);}

    int i,j;

    double fact=1*((n+1.)*(n+1.));
    //matrix initialization : the different bands
    // are stored in rows kl <= j< 2kl+ku+1
    for(i=0;i<n;i++){
        a[(0+kl)*n+i]=0;
        a[(1+kl)*n+i]=-1*fact;
        a[(2+kl)*n+i]=2*fact;
        a[(3+kl)*n+i]=-1*fact;
        a[(4+kl)*n+i]=0;

        //initialize source terms 
        for(j=0;j<nrhs;j++){
            b[i*nrhs+j]=sin(M_PI*(i+1)/(n+1.));
        }
    }
    printf("end ini \n");

    int ierr;


    // ROW_MAJOR is C order, Lapacke will compute ldab by himself.
    ierr=LAPACKE_dgbsv(LAPACK_ROW_MAJOR, n, kl,ku,nrhs, a,n, ipiv,  b,nrhs );


    if(ierr<0){LAPACKE_xerbla( "LAPACKE_dgbsv", ierr );}

    printf("output of LAPACKE_dgbsv\n");
    for(i=0;i<n;i++){
        for(j=0;j<nrhs;j++){
            printf("%g ",b[i*nrhs+j]);
        }
        printf("\n");
    }

    //checking correctness
    double norm=0;
    double diffnorm=0;
    for(i=0;i<n;i++){
        for(j=0;j<nrhs;j++){
            norm+=b[i*nrhs+j]*b[i*nrhs+j];
            diffnorm+=(b[i*nrhs+j]-1./(M_PI*M_PI)*sin(M_PI*(i+1)/(n+1.)))*(b[i*nrhs+j]-1./(M_PI*M_PI)*sin(M_PI*(i+1)/(n+1.)));
        }
    }
    printf("analical solution is 1/(PI*PI)*sin(x)\n");
    printf("relative difference is %g\n",sqrt(diffnorm/norm));


    free(a);
    free(b);
    free(ipiv);

    return 0;
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM