
Efficient generation of sparse matrix in a finite-difference solver

I'm writing a program to solve the 3D Schroedinger equation using a finite-difference method. The 1D and 2D versions of my code ran just fine, but in the 3D version, I'm finding that the generation of the matrix (for those of you who know QM, this is the Hamiltonian matrix; for those who don't, it's not important) is taking by far the most time (minutes for a typical grid spacing, versus seconds for all the other operations, including the smallest-eigenvalue-finder!).

I was wondering if anyone has any suggestions for how to write my matrix generation more efficiently. I include two versions of my code below: one that should be relatively easy to understand, and a second version that follows the advice in MATLAB's documentation: rather than indexing entries directly when building a sparse matrix, create three vectors (row indices, column indices, and their respective values) and generate the sparse matrix from them. Unfortunately the latter hasn't sped things up at all, because I'm still using a silly triply-nested loop, and I can't think of a good way to avoid it.

delta = 0.1e-9;
Lx = 2e-9;
x = 0:delta:Lx;
Nx = length(x);
Ly = 2e-9;
y = 0:delta:Ly;
Ny = length(y);
Lz = 2e-9;
z = 0:delta:Lz;
Nz = length(z);

map = inline('((idx_x-1) * Ny*Nz) + ((idx_y-1) * Nz) + idx_z','idx_x','idx_y','idx_z','Ny','Nz'); % define an inline helper function for mapping (x,y,z) indices to a linear index

Tsparse = sparse([],[],[],Nx*Ny*Nz, Nx*Ny*Nz, 7*(Nx-2)*(Ny-2)*(Nz-2)); % kinetic part of Hamiltonian matrix: (d^2/dx^2 + d^2/dy^2 + d^2/dz^2); NOTE: we'll have 7*(Nx-2)*(Ny-2)*(Nz-2) non-zero entries in this matrix, so we get the sparse() function to preallocate enough memory for this

for idx_x = 2:Nx-1
    for idx_y = 2:Ny-1
        for idx_z = 2:Nz-1
            Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x ,idx_y , idx_z , Ny, Nz) ) = -6/delta^2;
            Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x+1,idx_y , idx_z , Ny, Nz) ) = 1/delta^2;
            Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x-1,idx_y , idx_z , Ny, Nz) ) = 1/delta^2;
            Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x ,idx_y+1, idx_z , Ny, Nz) ) = 1/delta^2;
            Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x ,idx_y-1, idx_z , Ny, Nz) ) = 1/delta^2;
            Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x ,idx_y , idx_z+1, Ny, Nz) ) = 1/delta^2;
            Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x ,idx_y , idx_z-1, Ny, Nz) ) = 1/delta^2;
        end
    end
end

This code makes a matrix that only has non-zero entries along 7 diagonals (and not all of the entries in each of those diagonals are non-zero).
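To make the seven-diagonal structure concrete, here is a minimal sketch that computes the column offsets of the six neighbor entries relative to the diagonal. The grid sizes Ny = 4 and Nz = 5 are made up for illustration, and the helper is written as an anonymous function with the same formula as map above.

```matlab
% Hypothetical small grid sizes, chosen only for illustration
Ny = 4;
Nz = 5;

% Same index formula as the map helper, as an anonymous function
map = @(ix, iy, iz) (ix-1)*Ny*Nz + (iy-1)*Nz + iz;

c = map(2, 2, 2);   % linear index of one interior point

% Column offsets of the +x, -x, +y, -y, +z, -z neighbors relative to c
offsets = [map(3,2,2), map(1,2,2), map(2,3,2), ...
           map(2,1,2), map(2,2,3), map(2,2,1)] - c;

disp(offsets)       % Ny*Nz, -Ny*Nz, Nz, -Nz, 1, -1
```

The offsets do not depend on which interior point is chosen, which is why every non-zero entry lands on the main diagonal or on one of six fixed off-diagonals.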

Here's the version of the code where I attempted to create the T matrix in a way that is a little closer to what MATLAB's documentation suggests:

delta = 0.1e-9;
Lx = 2e-9;
x = 0:delta:Lx;
Nx = length(x);
Ly = 2e-9;
y = 0:delta:Ly;
Ny = length(y);
Lz = 2e-9;
z = 0:delta:Lz;
Nz = length(z);

map = inline('((idx_x-1) * Ny*Nz) + ((idx_y-1) * Nz) + idx_z','idx_x','idx_y','idx_z','Ny','Nz'); % define an inline helper function for mapping (x,y,z) indices to a linear index

Iidx = zeros(7*(Nx-2)*(Ny-2)*(Nz-2),1); % matrix row indices
Jidx = zeros(7*(Nx-2)*(Ny-2)*(Nz-2),1); % matrix col indices
vals = zeros(7*(Nx-2)*(Ny-2)*(Nz-2),1); % matrix non-zero values, corresponding to (row,col)
cnt = 1;
for idx_x = 2:Nx-1
    for idx_y = 2:Ny-1
        for idx_z = 2:Nz-1
            % Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x ,idx_y , idx_z , Ny, Nz) ) = -6/delta^2;
            Iidx(cnt) = map(idx_x,idx_y,idx_z,Ny,Nz);
            Jidx(cnt) = map(idx_x,idx_y,idx_z,Ny,Nz);
            vals(cnt) = -6/delta^2;
            cnt = cnt + 1;
            % Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x+1,idx_y , idx_z , Ny, Nz) ) = 1/delta^2;
            Iidx(cnt) = map(idx_x,idx_y,idx_z,Ny,Nz);
            Jidx(cnt) = map(idx_x+1,idx_y,idx_z,Ny,Nz);
            vals(cnt) = 1/delta^2;
            cnt = cnt + 1;
            % Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x-1,idx_y , idx_z , Ny, Nz) ) = 1/delta^2;
            Iidx(cnt) = map(idx_x,idx_y,idx_z,Ny,Nz);
            Jidx(cnt) = map(idx_x-1,idx_y,idx_z,Ny,Nz);
            vals(cnt) = 1/delta^2;
            cnt = cnt + 1;
            % Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x ,idx_y+1, idx_z , Ny, Nz) ) = 1/delta^2;
            Iidx(cnt) = map(idx_x,idx_y,idx_z,Ny,Nz);
            Jidx(cnt) = map(idx_x,idx_y+1,idx_z,Ny,Nz);
            vals(cnt) = 1/delta^2;
            cnt = cnt + 1;
            % Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x ,idx_y-1, idx_z , Ny, Nz) ) = 1/delta^2;
            Iidx(cnt) = map(idx_x,idx_y,idx_z,Ny,Nz);
            Jidx(cnt) = map(idx_x,idx_y-1,idx_z,Ny,Nz);
            vals(cnt) = 1/delta^2;
            cnt = cnt + 1;
            % Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x ,idx_y , idx_z+1, Ny, Nz) ) = 1/delta^2;
            Iidx(cnt) = map(idx_x,idx_y,idx_z,Ny,Nz);
            Jidx(cnt) = map(idx_x,idx_y,idx_z+1,Ny,Nz);
            vals(cnt) = 1/delta^2;
            cnt = cnt + 1;
            % Tsparse( map(idx_x,idx_y,idx_z,Ny,Nz) , map(idx_x ,idx_y , idx_z-1, Ny, Nz) ) = 1/delta^2;
            Iidx(cnt) = map(idx_x,idx_y,idx_z,Ny,Nz);
            Jidx(cnt) = map(idx_x,idx_y,idx_z-1,Ny,Nz);
            vals(cnt) = 1/delta^2;
            cnt = cnt + 1;

        end
    end
end
Tsparse = sparse(Iidx, Jidx, vals, Nx*Ny*Nz, Nx*Ny*Nz);

Thanks in advance for any suggestions!

-- dx.dy.dz

(Side note: The "map" function is used to go from a 3D coordinate system (x,y,z) to a 1D value. Suppose my eigenvalue problem is H psi = E psi, where H is the Hamiltonian matrix, psi is a vector, and E is a scalar. The matrix H = T + V (V is not shown in the code sample, only T is) is written in a basis where the 3D psi functions are discretized and collapsed from 3D to 1D. For example, imagine I only have 2 grid points per dimension, so x=1:1:2, y=1:1:2, z=1:1:2. Then my Hamiltonian is written in the basis {psi(1,1,1), psi(1,1,2), psi(1,2,1), psi(1,2,2), psi(2,1,1), psi(2,1,2), psi(2,2,1), psi(2,2,2)}, i.e. it is an 8-by-8 matrix. An eigenvector psi that the eigs() solver outputs will be an 8-component vector, which I can then reshape back to a 2x2x2 array if I want.)
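As a minimal sketch of that last reshaping step for the 2x2x2 example: in the ordering defined by map, idx_z varies fastest, whereas MATLAB's reshape fills the first dimension fastest, so one way to get an array indexed as (x,y,z) is to reshape to [Nz Ny Nx] and then permute. The vector psi below is a stand-in with made-up values, not real solver output.

```matlab
Nx = 2; Ny = 2; Nz = 2;

% Same index formula as the map helper, as an anonymous function
map = @(ix, iy, iz) (ix-1)*Ny*Nz + (iy-1)*Nz + iz;

% Stand-in "eigenvector": entry n simply holds the value n
psi = (1:Nx*Ny*Nz).';

% reshape fills the first dimension fastest, and z is the fastest index
% in the map ordering, so reshape to (z,y,x) and then permute to (x,y,z)
psi3 = permute(reshape(psi, [Nz, Ny, Nx]), [3 2 1]);

% Check that psi3(ix,iy,iz) equals psi(map(ix,iy,iz)) at every grid point
ok = true;
for ix = 1:Nx
    for iy = 1:Ny
        for iz = 1:Nz
            ok = ok && (psi3(ix,iy,iz) == psi(map(ix,iy,iz)));
        end
    end
end
disp(ok)
```

A plain reshape(psi, [Nx, Ny, Nz]) would return an array of the right size but with the x and z roles swapped relative to this basis ordering.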

I think I can give a few pointers:

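  • Instead of your own map, consider the built-in sub2ind function. (Note that sub2ind uses column-major ordering, so the first subscript varies fastest, whereas in your map idx_z varies fastest.)

  • You repeatedly call map(idx_x,idx_y,idx_z,Ny,Nz) with the same arguments. Compute it once per grid point and store the result for reuse.

  • The relative positions of the neighbors in the linear index are the same for every grid point, so there is no need to recalculate them inside the loop.

As a small example, I'd do it like this: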

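siz = [4,4,4];                      % grid size in each dimension

pos = sub2ind(siz,1,1,1);           % linear index of a reference point

% linear-index offsets from a point to its +x, +y, +z neighbors
tmp = [
    sub2ind(siz,2,1,1)-pos
    sub2ind(siz,1,2,1)-pos
    sub2ind(siz,1,1,2)-pos
    ];

neighbors = [tmp;-tmp];             % offsets to all six neighbors
%%
big_dim = prod(siz);
mat = sparse(big_dim,big_dim);      % all-zero sparse matrix
%%
for i=2:siz(1)-1
    for j=2:siz(2)-1
        for k=2:siz(3)-1
            c_pos = sub2ind(siz,i,j,k);     % compute the center index once
            mat(c_pos,c_pos)=-6;            % diagonal entry
            c_neighbors=c_pos+neighbors;    % reuse the fixed offsets
            mat(c_pos,c_neighbors)=1;       % six off-diagonal entries
        end
    end
end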
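Taking the same pointers one step further, the loop can probably be dropped altogether. The following is only a sketch under the same assumptions as the small example (a 4x4x4 grid, sub2ind ordering, entries -6 and 1): it collects the linear indices of all interior points with ndgrid, adds the fixed neighbor offsets, and passes the resulting triplets to a single sparse call. The names rows, cols and vals are just illustrative.

```matlab
siz = [4,4,4];                  % same small grid as in the example
big_dim = prod(siz);

% Subscripts of all interior points, then their linear indices (n-by-1)
[i, j, k] = ndgrid(2:siz(1)-1, 2:siz(2)-1, 2:siz(3)-1);
c_pos = sub2ind(siz, i(:), j(:), k(:));
n = numel(c_pos);

% Fixed linear-index offsets to the six neighbors (sub2ind is column-major)
tmp = [1, siz(1), siz(1)*siz(2)];
neighbors = [tmp, -tmp];        % 1-by-6 row vector

% Triplets: each interior point contributes one diagonal entry and six
% neighbor entries. bsxfun is used so this also runs on releases without
% implicit expansion.
rows = repmat(c_pos, 1, 7);                            % n-by-7
cols = [c_pos, bsxfun(@plus, c_pos, neighbors)];       % n-by-7
vals = [-6*ones(n,1), ones(n,6)];                      % n-by-7

% One call builds the whole matrix
mat = sparse(rows(:), cols(:), vals(:), big_dim, big_dim);
```

For the kinetic matrix in the question, the values would be scaled by 1/delta^2. Because sub2ind orders the grid points differently from the question's map, the resulting matrix is a symmetric permutation of the original one, so any eigenvectors need to be reshaped with the matching convention, e.g. reshape(psi, siz).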
