## Summary:

Trying to find an optimal way of solving a pentadiagonal matrix solver using a self-written solver or using the LAPACK library.

## Main Question or Discussion Point

Hello,

I'm currently writing a numerical simulation code for solving 2D steatdy-state heat conduction problems (diffusion equation). After reading and following these two book references (Numerical Heat Transfer and Fluid Flow from Patankar and And Introduction to Computational Fluid Dynamics from Versteerd and Malalasekera), I ended up with a penta-diagonal matrix to solve (I'm not sure this is the right name for it). The band corresponds to the coefficient at the west, east, north and south direction of the nodes.

The discretized equation looks like :
ap * Tp = aw * Tw + ae * Te + an * Tn + as * Ts + Su
ap = aw + ae + as + an - Sp

The final version of this code will be written in Golang as it will be linked to an existing numerical code. I planning to write myself the solver or use the LAPACK library. But to do some trial and tests, I've first written it using Matlab.

Both books are proposing to solve the problem using a line-by-line method. Simply using the TDMA (Thomas algorithm) looping on each column then on each line until we reach convergence.

I've also implemented some other iterative algorithms (Gauss-seidel and Jacobi).

For the same tolerance norm(A*x-b) <= 1e-4, I obtain the following performances :

Number of elements - >16006400160000
Line-by-line TDMA1.05s13.25sx
Gauss-seidel87.57sxx
Jacobi226.9sxx
Direct (Matlab A \ B)0.004480s0.0209s0.65s

I am pretty sure I am obtaining bad results for the Gauss-seidel and Jacobi method because these algorithms can solve any problems with non-singular and non-zero diagonal A matrices, while I have a simple sparse matrix and I don't take advantages of this.

The line-by-line TDMA is much faster than the two other ones. However, if I set up correctly the sparse matrix in Matlab, the "A \ B" solve it much faster than any other solution. My question is then: How can I obtain similar performances using self-written solver or LAPACK library solver/tools ?

I'm looking for insight are advices not of proof of concept of the best solution =)

Thank you