Can Different Access Patterns Improve a Matrix's Condition Number?

nurfherder
Hello all,

I am new to this forum but glad I found it. I have a quick question about condition numbers and order of operations.

Given a symmetric positive-definite matrix with an initial condition number α, is it possible to improve that condition number with a different access pattern? For example, if I access the matrix within an iterative solver (e.g., Jacobi) in column-major order, would that improve the condition number over access done in row-major order?

I am doing some personal research into iterative solvers and convergence rates, and I would like to know whether the condition number, and with it the total number of iterations to converge, can be improved significantly by something so small.

Thank you.
 
Never mind about my initial post: the order of operations, column-major versus row-major, will have no effect on the condition number of the matrix. The access order should not affect the spectral radius (the eigenvalues) of the iteration matrix.
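For anyone curious, here is a minimal sketch (assuming numpy, with a made-up 2x2 SPD matrix) showing that the condition number and the Jacobi spectral radius depend only on the matrix entries, not on whether the matrix is stored or traversed row-major or column-major:

Code:
# Sketch: the condition number and the Jacobi iteration matrix depend only on
# the matrix values, not on the storage/access order.
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive-definite example
A_row = np.ascontiguousarray(A)          # row-major (C) storage
A_col = np.asfortranarray(A)             # column-major (Fortran) storage

print(np.linalg.cond(A_row))             # identical condition number...
print(np.linalg.cond(A_col))             # ...regardless of layout

# Jacobi iteration matrix M = D^{-1}(D - A); its spectral radius is likewise
# unchanged by the access pattern.
D = np.diag(np.diag(A))
M = np.linalg.inv(D) @ (D - A)
print(max(abs(np.linalg.eigvals(M))))    # spectral radius < 1  =>  Jacobi converges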

Does anyone have any clue as to why an iterative algorithm, such as Jacobi, would need fewer iterations to converge when executed on the GPU versus the CPU? The model and tolerance are exactly the same in both cases, so I cannot understand how the GPU needs fewer iterations using a Krylov search space. I have executed the SAME code for CPU and GPU (except, of course, that the CPU version has NO calls to the GPU) on two different sets of CPUs and GPUs (one double precision - a Tesla - and one not - a Quadro) and get exactly the same result.

Any ideas would be great; I think I might have broken one of my neurons on this one.

Thanks.
 
nurfherder said:
Does anyone have any clue as to why an iterative algorithm, such as Jacobi, would need fewer iterations to converge when executed on the GPU versus the CPU? The model and tolerance are exactly the same in both cases, so I cannot understand how the GPU needs fewer iterations using a Krylov search space.

That doesn't make any sense to me. If you do the EXACT same operations, you should get exactly the same results.

The explanation may be something to do with compiler optimisation, compiler bugs, library routines, different implementations of floating point arithmetic, etc. The only way to nail that is to compare your two calculations step by step. If there are differences, they will probably show up (if only in the last decimal place) on small matrices as well as on big ones.
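To see why the floating point implementation can matter at all: floating-point addition is not associative, so summing the same numbers in a different order (say, a sequential CPU-style accumulation versus a tree-style reduction, which is roughly what a GPU does) can give slightly different results. A minimal sketch, assuming numpy; the array size and random data are just illustrative:

Code:
# Sketch: the same data summed in two different orders gives slightly
# different float32 results, because rounding happens after every addition.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000).astype(np.float32)

sequential = np.float32(0.0)
for v in x:                      # strict left-to-right accumulation
    sequential += v

pairwise = x.sum()               # numpy uses pairwise (tree-like) summation

print(sequential, pairwise, sequential - pairwise)   # usually not exactly equal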
 
You are right - it doesn't make sense to me either. I was just wondering if there was an obvious and therefore easy reason.

Thank you for your help and time.
 
I found the problem.

It turns out that the GPU typically has some round-off error that happens to benefit the iterative solver, so the higher-precision CPU run takes more iterations to converge. The GPU's small inaccuracies become magnified in large summations, such as those in the dot products of the iterative solver I am using.

It is sneaky and is typically addressed by using double precision (CUDA compute capability 1.3 or greater) or algorithmically with Kahan (compensated) summation.
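For reference, here is a minimal sketch of Kahan (compensated) summation, assuming numpy; the array contents are just illustrative. The compensation variable recovers the low-order bits that are lost when a small value is added to a large partial sum:

Code:
# Sketch of Kahan (compensated) summation in float32.
import numpy as np

def kahan_sum(values):
    total = np.float32(0.0)
    c = np.float32(0.0)          # running compensation for lost low-order bits
    for v in values:
        y = np.float32(v) - c
        t = total + y            # low-order bits of y are lost here...
        c = (t - total) - y      # ...and recovered into c
        total = t
    return total

x = np.full(100_000, np.float32(0.1))

naive = np.float32(0.0)
for v in x:                      # plain float32 accumulation drifts
    naive += v

# Compensated float32 sum tracks the double-precision reference closely.
print(naive, kahan_sum(x), x.astype(np.float64).sum())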
 
nurfherder said:
It turns out that the GPU typically has some round-off error that happens to benefit the iterative solver, so the higher-precision CPU run takes more iterations to converge.

Hm... long before the days of GPUs, I remember a CFD software guru trying to convince me that his code worked better in 32-bit precision arithmetic than in 64-bit (in fact it didn't work at all in 64-bit on most problems).

Maybe he gave up trying to sell his CFD software and went into GPU design ... :smile:
 